On Wednesday, 12 June 2024, 20:40:37 EEST, James Almer wrote:
> On 6/12/2024 1:47 AM, Rémi Denis-Courmont wrote:
> > Note that optimised implementations of these functions will be taken
> > into actual use only if MpegEncContext.dct_unquantize_h263_{inter,intra}
> > are *not* overloaded by existing optimisations.
> >
> > ---
> > This adds the plus ones back, saving two branch instructions in C and
> > one in assembler (at the cost of two unconditional adds).
>
> See my reply in the previous version. Not sure if it will help with this.
We can of course avoid the branches: this version does, as did the initial
versions. What we cannot avoid, in C or in RVV, is incrementing the pointer
and a counter variable.
If you change the loop as you suggest:
    for (size_t i = 1; i <= nCoeffs; i++) {
        int level = block[i];
        if (level) {
            if (level < 0)
                level = level * qmul - qadd;
            else
                level = level * qmul + qadd;
            block[i] = level;
        }
    }
... at best, an optimising compiler will reinterpret it to:
    if (nCoeffs >= 1) {
        block++;
        end = block + nCoeffs;
    loop:
        level = *block;
        if (level) {
            tmp = level * qmul;
            if (level < 0)
                tmp -= qadd;
            else
                tmp += qadd;
            *block = tmp;
        }
        block++; /* advance even when the coefficient is zero */
        if (block < end)
            goto loop;
    }
Or perhaps the compiler will keep an explicit counter, which is even worse.
This does not save branches, nor increments. It just looks like it because of
the syntactic sugar that is the for() loop. In reality, this only duplicates
code (as we can no longer share between inter/intra).
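
For illustration, here is a rough sketch of the sharing I mean. The names
unquantize_shared, unquantize_inter and unquantize_intra are made up for this
mail and are not the functions from the patch. I am also assuming that the
inter case covers indices 0..nCoeffs and the intra case covers 1..nCoeffs (with
the DC coefficient handled elsewhere). The "plus ones" are then the two
unconditional adds in the wrappers, and the loop body exists only once:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared kernel: rescales len coefficients starting at block.
 * Zero coefficients are left untouched. */
static void unquantize_shared(int16_t *block, size_t len, int qmul, int qadd)
{
    for (size_t i = 0; i < len; i++) {
        int level = block[i];

        if (level) {
            /* qadd is applied with the sign of the coefficient. */
            level = level * qmul + (level < 0 ? -qadd : qadd);
            block[i] = (int16_t)level;
        }
    }
}

/* Assumed inter range: indices 0..nCoeffs, i.e. nCoeffs + 1 elements. */
static void unquantize_inter(int16_t *block, size_t nCoeffs, int qmul, int qadd)
{
    unquantize_shared(block, nCoeffs + 1, qmul, qadd);
}

/* Assumed intra range: indices 1..nCoeffs, i.e. skip the DC coefficient. */
static void unquantize_intra(int16_t *block, size_t nCoeffs, int qmul, int qadd)
{
    unquantize_shared(block + 1, nCoeffs, qmul, qadd);
}
```

With the loop starting at 1 as in the suggestion, the intra variant would need
its own copy of the loop body instead of reusing the shared one.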
--
レミ・デニ-クールモン
http://www.remlab.net/