On Wed, Jun 5, 2024 at 10:38 AM Li, Pan2 <[email protected]> wrote:
>
> > I see. x86 doesn't have scalar saturating instructions, so the scalar
> > version indeed can't be converted.
>
> > I will amend x86 testcases after the vector part of your patch is committed.
>
> Thanks for the confirmation. Just curious: the scalar .SAT_SUB comes in
> several forms, such as the branch version below.
>
> .SAT_SUB (x, y) = x > y ? x - y : 0;  // or leverage __builtin_sub_overflow here
>
> Is it reasonable to implement the scalar .SAT_SUB for x86, given that we
> can somehow eliminate the branch here?

x86 will emit cmov in the above case:
        movl    %edi, %eax
        xorl    %edx, %edx
        subl    %esi, %eax
        cmpl    %edi, %esi
        cmovnb  %edx, %eax
Maybe we can reuse the flags from the subtraction here to avoid the compare.
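
For illustration only, here is a minimal C sketch of the two source forms discussed above. The function names sat_sub_branch and sat_sub_overflow are hypothetical and do not come from the patch. The second form takes the borrow directly from the subtraction, and that borrow is the carry flag subl already sets, so in principle a cmovb on those flags could replace the separate cmpl. Whether a given GCC version actually generates that sequence has not been checked here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branch form from the quoted mail: x > y ? x - y : 0. */
uint32_t sat_sub_branch(uint32_t x, uint32_t y)
{
    return x > y ? x - y : 0;
}

/* Overflow-builtin form: __builtin_sub_overflow returns true when the
   unsigned subtraction wraps (x < y), i.e. when the hardware subtraction
   produces a borrow.  The wrapped difference is discarded in that case. */
uint32_t sat_sub_overflow(uint32_t x, uint32_t y)
{
    uint32_t diff;
    bool borrow = __builtin_sub_overflow(x, y, &diff);
    return borrow ? 0 : diff;
}
```

Both functions return the same value for every input. For x == y the difference is already zero, so it does not matter which side of the select is taken.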
Uros.