Commit 4ace1d3

1 parent 2295241 commit 4ace1d3

1 file changed: +37 -31 lines changed
src/torchcodec/_core/SingleStreamDecoder.cpp

Lines changed: 37 additions & 31 deletions
@@ -1081,32 +1081,17 @@ void SingleStreamDecoder::setCursor(int64_t pts) {
   cursor_ = pts;
 }
 
-/*
-Videos have I frames and non-I frames (P and B frames). Non-I frames need data
-from the previous I frame to be decoded.
-
-Imagine the cursor is at a random frame with PTS=lastDecodedAvFramePts (x for
-brevity) and we wish to seek to a user-specified PTS=y.
-
-If y < x, we don't have a choice but to seek backwards to the highest I frame
-before y.
-
-If y > x, we have two choices:
-
-1. We could keep decoding forward until we hit y. Illustrated below:
-
-I P P P I P P P I P P I P P I P
-  x         y
-
-2. We could try to jump to an I frame between x and y (indicated by j below).
-And then start decoding until we encounter y. Illustrated below:
-
-I P P P I P P P I P P I P P I P
-  x     j   y
-
-(2) is more efficient than (1) if there is an I frame between x and y.
-*/
 bool SingleStreamDecoder::canWeAvoidSeeking() const {
+  // Returns true if we can avoid seeking in the AVFormatContext based on
+  // heuristics that rely on the target cursor_ and the last decoded frame.
+  // Seeking is expensive, so we try to avoid it when possible.
+  // Note that this function itself isn't always that cheap to call: in
+  // particular the calls to getKeyFrameIndexForPts below in approximate mode
+  // are sometimes slow.
+  // TODO we should understand why (is it because it reads the file?) and
+  // potentially optimize it. E.g. we may not want to ever seek, or even *check*
+  // if we need to seek in some cases, like if we're going to decode 80% of the
+  // frames anyway.
   const StreamInfo& streamInfo = streamInfos_.at(activeStreamIndex_);
   if (streamInfo.avMediaType == AVMEDIA_TYPE_AUDIO) {
     // For audio, we only need to seek if a backwards seek was requested
@@ -1129,13 +1114,34 @@ bool SingleStreamDecoder::canWeAvoidSeeking() const {
     // implement caching.
     return false;
   }
-  // We are seeking forwards.
-  // We can only skip a seek if both lastDecodedAvFramePts and
-  // cursor_ share the same keyframe.
-  int lastDecodedAvFrameIndex = getKeyFrameIndexForPts(lastDecodedAvFramePts_);
+  // We are seeking forwards. We can skip a seek if both the last decoded frame
+  // and cursor_ share the same keyframe:
+  // Videos have I frames and non-I frames (P and B frames). Non-I frames need
+  // data from the previous I frame to be decoded.
+  //
+  // Imagine the cursor is at a random frame with PTS=lastDecodedAvFramePts (x
+  // for brevity) and we wish to seek to a user-specified PTS=y.
+  //
+  // If y < x, we don't have a choice but to seek backwards to the highest I
+  // frame before y.
+  //
+  // If y > x, we have two choices:
+  //
+  // 1. We could keep decoding forward until we hit y. Illustrated below:
+  //
+  // I P P P I P P P I P P I P
+  //   x         y
+  //
+  // 2. We could try to jump to an I frame between x and y (indicated by j
+  // below). And then start decoding until we encounter y. Illustrated below:
+  //
+  // I P P P I P P P I P P I P
+  //   x     j   y
+  // (2) is only more efficient than (1) if there is an I frame between x and y.
+  int lastKeyFrameIndex = getKeyFrameIndexForPts(lastDecodedAvFramePts_);
   int targetKeyFrameIndex = getKeyFrameIndexForPts(cursor_);
-  return lastDecodedAvFrameIndex >= 0 && targetKeyFrameIndex >= 0 &&
-      lastDecodedAvFrameIndex == targetKeyFrameIndex;
+  return lastKeyFrameIndex >= 0 && targetKeyFrameIndex >= 0 &&
+      lastKeyFrameIndex == targetKeyFrameIndex;
 }
 
 // This method looks at currentPts and desiredPts and seeks in the
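
The "same keyframe" check described in the comment above can be illustrated with a small standalone sketch. This is not torchcodec's code: keyFrameIndexForPts and canAvoidSeeking are hypothetical stand-ins for getKeyFrameIndexForPts and the forward-seek branch of canWeAvoidSeeking, and the sketch assumes the key frame PTS values are already available as a sorted vector (the real decoder looks them up per stream, which this commit notes can be slow in approximate mode). Treating the "same PTS" case as requiring a seek is likewise an assumption based on the surrounding context lines.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not torchcodec's implementation): returns the index of
// the last key frame whose PTS is <= pts, or -1 if pts precedes every key
// frame. keyFramePts is assumed to be sorted in ascending order.
int keyFrameIndexForPts(const std::vector<int64_t>& keyFramePts, int64_t pts) {
  assert(std::is_sorted(keyFramePts.begin(), keyFramePts.end()));
  // upper_bound finds the first key frame with a PTS strictly greater than
  // pts; the key frame just before it is the one that pts depends on.
  auto it = std::upper_bound(keyFramePts.begin(), keyFramePts.end(), pts);
  return static_cast<int>(it - keyFramePts.begin()) - 1;
}

// Hypothetical stand-in for the video branch of canWeAvoidSeeking().
// lastDecodedPts plays the role of x, cursor the role of y.
bool canAvoidSeeking(
    const std::vector<int64_t>& keyFramePts,
    int64_t lastDecodedPts,
    int64_t cursor) {
  if (cursor <= lastDecodedPts) {
    // Assumption: backwards and same-frame requests always need a seek.
    return false;
  }
  // Forward request: decoding on from x only beats a seek when no key frame
  // lies between x and y, i.e. both map to the same key frame index.
  int lastKeyFrameIndex = keyFrameIndexForPts(keyFramePts, lastDecodedPts);
  int targetKeyFrameIndex = keyFrameIndexForPts(keyFramePts, cursor);
  return lastKeyFrameIndex >= 0 && targetKeyFrameIndex >= 0 &&
      lastKeyFrameIndex == targetKeyFrameIndex;
}
```

With key frames at PTS 0, 4, 8, 11 and 14 (the I frame positions in the removed diagram, using the frame index as PTS), canAvoidSeeking returns true for x=1, y=3 because both fall under the key frame at 0, and false for x=1, y=6 because the key frame at 4 sits between them.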
