Oct 7, 2023, 17:08 by [email protected]:

> Removes the clever subgroup parallel prefix computation,
> and instead just computes the prefix inline.
> Cuts down the number of dispatches by a huge amount.
>
> Provides a ~12x speedup (2.5fps to 30fps on a 7900XTX,
> 2.1fps to 24fps on an Ada).
>
> Patch attached.

Going to push the patchset a bit later today.
