On 4/15/2013 1:08 PM, Brad O'Hearne wrote:
Hello,

In a processing workflow that includes first encoding a video / audio frame 
followed by writing that frame to a file/network stream, it is required to set 
the packet pts prior to encoding. The subsequent encoding operation then 
returns an output value (gotPacket) indicating whether a packet was returned, 
e.g.:

    returnVal = avcodec_encode_audio2(codecCtx, &_avPacket, _streamAudioFrame, &gotPacket);

A packet might not have been returned, but if one has been returned, there are 
two notable characteristics about it:

1. It is not necessarily the packet for the frame that was just submitted; i.e. the 
encoder can return a different packet (presumably due to reordering as necessary, etc.).

2. The returned packet does not have an accurate pts set, and so this must be 
manually set in the code.

When dealing with a video stream, the general approach to setting packet pts is 
to grab the AVCodecContext's pts of its coded frame, and then rescale it from a 
value relative to the context's time_base (that's the one based on frame rate) 
to one relative to the stream's time_base (the one based on time), as such:

    _avPacket.pts = av_rescale_q(codecCtx->coded_frame->pts, codecCtx->time_base, _videoStream->time_base);
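
To make the arithmetic of that rescale concrete, here is a minimal sketch in plain C. It does not call libavutil: `Rational` and `rescale_q_sketch` are hypothetical stand-ins for AVRational and av_rescale_q, and the sketch assumes positive time bases, ignores 64-bit overflow, and rounds to nearest (which, as far as I know, is what av_rescale_q does by default).

```c
#include <stdint.h>

/* Hypothetical stand-in for AVRational; not the libavutil type. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/*
 * Convert timestamp `a` from time base `bq` to time base `cq`:
 *   result = a * (bq.num / bq.den) / (cq.num / cq.den)
 * rounded to nearest, halfway cases away from zero.
 * Assumes positive time bases and no intermediate overflow.
 */
static int64_t rescale_q_sketch(int64_t a, Rational bq, Rational cq)
{
    int64_t b = (int64_t)bq.num * cq.den;
    int64_t c = (int64_t)bq.den * cq.num;

    if (a < 0)
        return -((-a * b + c / 2) / c);
    return (a * b + c / 2) / c;
}
```

With a codec time base of 1/30 and a stream time base of 1/90000, frame number 30 (one second in) becomes 90000 ticks, and each frame advances the packet pts by 3000.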

The key point there is that the pts of the frame involved with the packet 
returned is accessible in the context's coded_frame, i.e.:

videoCodecCtx->coded_frame->pts

That's great for video. But when processing audio, coded_frame->pts is 
consistently junk, always valued -9223372036854775808. So my question is: after 
encoding, if avcodec_encode_audio2 does return a packet, how do I access the 
pts for the encoded frame (or, in the case of audio, it would probably be more 
proper to say the encoded samples)? Again, given that the packet returned by the 
encoder isn't necessarily the one for the frame just submitted, we need the pts 
that belongs to the frame actually contained in the returned packet.

How do I access this value?


I can't address everything here, but I can contribute a couple things:

-9223372036854775808 is not technically "junk". It is AV_NOPTS_VALUE, which is a deliberately chosen value with a specific meaning: there is no pts. It is the minimum value of a signed 64-bit integer. Over the years I've learned to recognize -2^31, but -2^63 is a more recent animal... so I have missed this myself ;)
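
A small sketch of how one might test for that sentinel instead of treating it as a number. To keep it compilable without the FFmpeg headers, `NOPTS_SENTINEL` and `pts_is_set` are hypothetical local names; in real code you would include libavutil/avutil.h and compare against AV_NOPTS_VALUE directly.

```c
#include <stdint.h>

/*
 * Local stand-in for AV_NOPTS_VALUE (assumption: same value, i.e. the
 * most negative int64_t, bit pattern 0x8000000000000000).
 */
#define NOPTS_SENTINEL INT64_MIN

/* Returns 1 if the timestamp carries a real value, 0 if it means "no pts". */
static int pts_is_set(int64_t pts)
{
    return pts != NOPTS_SENTINEL;
}
```

A check like this in front of any rescaling keeps the "no pts" marker from being converted into a meaningless but plausible-looking timestamp.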

What I've been doing to get timestamps of encoded frames in the face of codec caching is to maintain a timestamp queue. You push the pts of your frame onto the queue each time you encode a frame; you only pop a timestamp off if you got an encoded frame back. The first frame out of the codec has to be the first one you put in, and it will match the first timestamp popped off the queue. Unless the codec is eating frames, the timestamps you pop off the queue _must_ match the frames you're getting out of the codec. Pseudocode is something like:

tsQueue.push(frame.pts);
encode(frame);
if (got_an_encoded_frame_back)
{
  encoded_frame_pts = tsQueue.pop();
  // do whatever with the encoded frame and its pts
}

The queue will naturally size itself to the latency of the codec, so you can "autodetect" the latency of codecs by checking the size of your timestamp queue. If there is no latency, you'll just be popping off the timestamp immediately after pushing it on, which might be seen as a waste of time, but the technique still works.
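
For anyone who wants something closer to compilable, here is one possible sketch of the queue in plain C. Everything in it is hypothetical and independent of libav*: `TsQueue` is a small fixed-capacity ring buffer, `FakeEncoder` merely imitates a codec that holds back `delay` frames before it starts returning packets, and the 1024-tick pts step is an arbitrary choice. It also assumes packets come out in the same order the frames went in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TS_QUEUE_CAP 64  /* arbitrary capacity chosen for this sketch */

/* Fixed-capacity FIFO of timestamps (hypothetical helper, not a libav* type). */
typedef struct TsQueue {
    int64_t items[TS_QUEUE_CAP];
    size_t  head;   /* index of the oldest entry */
    size_t  count;  /* number of entries currently queued */
} TsQueue;

static void ts_queue_init(TsQueue *q)
{
    q->head = 0;
    q->count = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
static int ts_queue_push(TsQueue *q, int64_t pts)
{
    if (q->count == TS_QUEUE_CAP)
        return -1;
    q->items[(q->head + q->count) % TS_QUEUE_CAP] = pts;
    q->count++;
    return 0;
}

/* Stores the oldest pts in *pts and returns 0, or returns -1 if empty. */
static int ts_queue_pop(TsQueue *q, int64_t *pts)
{
    if (q->count == 0)
        return -1;
    *pts = q->items[q->head];
    q->head = (q->head + 1) % TS_QUEUE_CAP;
    q->count--;
    return 0;
}

/* Imitates a codec that buffers `delay` frames before emitting packets. */
typedef struct FakeEncoder {
    int delay;
    int buffered;
} FakeEncoder;

/* Returns 1 if a packet was produced for this call, 0 otherwise. */
static int fake_encode(FakeEncoder *enc)
{
    if (enc->buffered < enc->delay) {
        enc->buffered++;
        return 0;
    }
    return 1;
}

/*
 * Feeds n_frames frames with pts 0, 1024, 2048, ... through the fake
 * encoder. Writes the pts recovered for each packet into out_pts and the
 * number of timestamps still queued into *latency. Returns the number of
 * packets produced, or -1 on queue overflow/underflow.
 */
static int run_demo(int delay, int n_frames, int64_t *out_pts, size_t *latency)
{
    TsQueue q;
    FakeEncoder enc = { delay, 0 };
    int n_out = 0;

    assert(delay >= 0 && n_frames >= 0);
    ts_queue_init(&q);

    for (int i = 0; i < n_frames; i++) {
        /* Push before encoding, exactly as in the pseudocode above. */
        if (ts_queue_push(&q, (int64_t)i * 1024) < 0)
            return -1;
        /* Pop only when the encoder actually handed a packet back. */
        if (fake_encode(&enc)) {
            if (ts_queue_pop(&q, &out_pts[n_out]) < 0)
                return -1;
            n_out++;
        }
    }
    *latency = q.count;
    return n_out;
}
```

Calling run_demo(2, 5, out, &latency) yields three packets whose recovered pts values are 0, 1024 and 2048, with two timestamps left in the queue, which is the codec latency showing up as the queue size. In a real program the leftover entries would be consumed while flushing the encoder at end of stream.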

Andy

