Michael Niedermayer: > On Mon, Apr 08, 2024 at 10:13:40PM +0200, Andreas Rheinhardt wrote: >> Frame-threaded decoders with inter-frame dependencies >> use the ThreadFrame API for syncing. It works as follows: >> >> During init each thread allocates an AVFrame for every >> ThreadFrame. >> >> Thread A reads the header of its packet and allocates >> a buffer for an AVFrame with ff_thread_get_ext_buffer() >> (which also allocates a small structure that is shared >> with other references to this frame) and sets its fields, >> including side data. Then said thread calls ff_thread_finish_setup(). >> From that moment onward it is not allowed to change any >> of the AVFrame fields at all any more, but it may change >> fields which are an indirection away, like the content of >> AVFrame.data or already existing side data. >> >> After thread A has called ff_thread_finish_setup(), >> another thread (the user one) calls the codec's update_thread_context >> callback which in turn calls ff_thread_ref_frame() which >> calls av_frame_ref() which reads every field of A's >> AVFrame; hence the above restriction on modifications >> of the AVFrame (as any modification of the AVFrame by A after >> ff_thread_finish_setup() would be a data race). Of course, >> this av_frame_ref() also incurs allocations and therefore >> needs to be checked. ff_thread_ref_frame() also references >> the small structure used for communicating progress. >> >> This av_frame_ref() makes it awkward to propagate values that >> only become known during decoding to later threads (in case of >> frame reordering or other mechanisms of delayed output (like >> show-existing-frames) it's not the decoding thread, but a later >> thread that returns the AVFrame). E.g. for VP9 when exporting video >> encoding parameters as side data the number of blocks only >> becomes known during decoding, so one can't allocate the side data >> before ff_thread_finish_setup(). It is currently being done afterwards >> and this leads to a data race in the vp9-encparams test when using >> frame-threading. Returning decode_error_flags is also complicated >> by this. >> >> To perform this exchange a buffer shared between the references >> is needed (notice that simply giving the later threads a pointer >> to the original AVFrame does not work, because said AVFrame will >> be reused lateron when thread A decodes the next packet given to it). >> One could extend the buffer already used for progress for this >> or use a new one (requiring yet another allocation), yet both >> of these approaches have the drawback of being unnatural, ugly >> and requiring quite a lot of ad-hoc code. E.g. in case of the VP9 >> side data mentioned above one could not simply use the helper >> that allocates and adds the side data to an AVFrame in one go. >> >> The ProgressFrame API meanwhile offers a different solution to all >> of this. It is based around the idea that the most natural >> shared object for sharing information about an AVFrame between >> decoding threads is the AVFrame itself. To actually implement this >> the AVFrame needs to be reference counted. This is achieved by >> putting a (ownership) pointer into a shared (and opaque) structure >> that is managed by the RefStruct API and which also contains >> the stuff necessary for progress reporting. >> The users get a pointer to this AVFrame with the understanding >> that the owner may set all the fields until it has indicated >> that it has finished decoding this AVFrame; then the users are >> allowed to read everything. Every decoder may of course employ >> a different contract than the one outlined above. >> >> Given that there is no underlying av_frame_ref(), creating >> references to a ProgressFrame can't fail. Only >> ff_thread_progress_get_buffer() can fail, but given that >> it will replace calls to ff_thread_get_ext_buffer() it is >> at places where errors are already expected and properly >> taken care of. >> >> The ProgressFrames are empty (i.e. the AVFrame pointer is NULL >> and the AVFrames are not allocated during init at all) >> while not being in use; ff_thread_progress_get_buffer() both >> sets up the actual ProgressFrame and already calls >> ff_thread_get_buffer(). So instead of checking for >> ThreadFrame.f->data[0] or ThreadFrame.f->buf[0] being NULL >> for "this reference frame is non-existing" one should check for >> ProgressFrame.f. >> >> This also implies that one can only set AVFrame properties >> after having allocated the buffer. This restriction is not deep: >> if it becomes onerous for any codec, ff_thread_progress_get_buffer() >> can be broken up. The user would then have to get a buffer >> himself. >> >> In order to avoid unnecessary allocations, the shared structure >> is pooled, so that both the structure as well as the AVFrame >> itself are reused. This means that there won't be lots of >> unnecessary allocations in case of non-frame-threaded decoding. >> It might even turn out to have fewer than the current code >> (the current code allocates AVFrames for every DPB slot, but >> these are often excessively large and not completely used; >> the new code allocates them on demand). Pooling relies on the >> reset function of the RefStruct pool API, it would be impossible >> to implement with the AVBufferPool API. >> >> Finally, ProgressFrames have no notion of owner; they are built >> on top of the ThreadProgress API which also lacks such a concept. >> Instead every ThreadProgress and every ProgressFrame contains >> its own mutex and condition variable, making it completely independent >> of pthread_frame.c. Just like the ThreadFrame API it is simply >> presumed that only the actual owner/producer of a frame reports >> progress on said frame. >> >> Signed-off-by: Andreas Rheinhardt <[email protected]> >> --- >> libavcodec/Makefile | 1 + >> libavcodec/avcodec.c | 1 + >> libavcodec/codec_internal.h | 4 ++ >> libavcodec/decode.c | 122 +++++++++++++++++++++++++++++++++ >> libavcodec/internal.h | 2 + >> libavcodec/progressframe.h | 133 ++++++++++++++++++++++++++++++++++++ >> libavcodec/pthread_frame.c | 1 + >> libavcodec/tests/avcodec.c | 3 +- >> 8 files changed, 266 insertions(+), 1 deletion(-) >> create mode 100644 libavcodec/progressframe.h > > breaks build without threads > > ./configure --disable-pthreads --cc='ccache gcc' && make -j32 > ... > libavcodec/threadprogress.c: In function ‘ff_thread_progress_report’: > libavcodec/threadprogress.c:54:5: error: implicit declaration of function > ‘pthread_mutex_lock’; did you mean ‘ff_mutex_lock’? > [-Werror=implicit-function-declaration] > pthread_mutex_lock(&pro->progress_mutex); > ^~~~~~~~~~~~~~~~~~ > ff_mutex_lock > libavcodec/threadprogress.c:55:5: error: implicit declaration of function > ‘pthread_cond_broadcast’; did you mean ‘ff_cond_broadcast’? > [-Werror=implicit-function-declaration] > pthread_cond_broadcast(&pro->progress_cond); > ^~~~~~~~~~~~~~~~~~~~~~ > ff_cond_broadcast > libavcodec/threadprogress.c:56:5: error: implicit declaration of function > ‘pthread_mutex_unlock’; did you mean ‘ff_mutex_unlock’? > [-Werror=implicit-function-declaration] > pthread_mutex_unlock(&pro->progress_mutex); > ^~~~~~~~~~~~~~~~~~~~ > ff_mutex_unlock > libavcodec/threadprogress.c: In function ‘ff_thread_progress_await’: > libavcodec/threadprogress.c:71:9: error: implicit declaration of function > ‘pthread_cond_wait’; did you mean ‘ff_cond_wait’? > [-Werror=implicit-function-declaration] > pthread_cond_wait(&pro->progress_cond, &pro->progress_mutex); > ^~~~~~~~~~~~~~~~~ > ff_cond_wait > cc1: some warnings being treated as errors > ffbuild/common.mak:81: recipe for target 'libavcodec/threadprogress.o' failed > make: *** [libavcodec/threadprogress.o] Error 1 > make: *** Waiting for unfinished jobs.... >
Thanks for testing; fixed by v3 of patch #1. - Andreas _______________________________________________ ffmpeg-devel mailing list [email protected] https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email [email protected] with subject "unsubscribe".
