Hi Mike,
The source code changes are missing.
Regards,
Surya
On 22/05/25 10:46 am, Michael Meissner wrote:
> Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.
>
> This is version 3 of the patch. I re-implemented the patch to just focus on
> the generation of the XSCMP{EQ,GT,GE}{DP,QP} instructions.
>
> In bug PR target/118541 on power9, power10, and power11 systems, for the
> function:
>
>     extern double __ieee754_acos (double);
>
>     double
>     __acospi (double x)
>     {
>       double ret = __ieee754_acos (x) / 3.14;
>       return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
>     }
>
> GCC currently generates the following code:
>
>     Power9                        Power10 and Power11
>     ======                        ===================
>     bl __ieee754_acos             bl __ieee754_acos@notoc
>     nop                           plfd 0,.LC0@pcrel
>     addis 9,2,.LC2@toc@ha         xxspltidp 12,1065353216
>     addi 1,1,32                   addi 1,1,32
>     lfd 0,.LC2@toc@l(9)           ld 0,16(1)
>     addis 9,2,.LC0@toc@ha         fdiv 0,1,0
>     ld 0,16(1)                    mtlr 0
>     lfd 12,.LC0@toc@l(9)          xscmpgtdp 1,0,12
>     fdiv 0,1,0                    xxsel 1,0,12,1
>     mtlr 0                        blr
>     xscmpgtdp 1,0,12
>     xxsel 1,0,12,1
>     blr
>
> This is because ifcvt.c optimizes the conditional floating point move to use
> the XSCMPGTDP instruction.
>
> However, the XSCMPGTDP instruction will generate an interrupt if one of the
> arguments is a signalling NaN. The IEEE comparison functions (isgreater, etc.)
> require that the comparison not raise an interrupt.
>
> The root cause of this is we allow floating point comparisons to be reversed
> (i.e. LT will be reversed to UNGE). Before power9, this was ok because we only
> generated the FCMPU or XSCMPUDP instructions.
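
To make the reversal problem above concrete, here is a minimal sketch in plain C. The helper names (lt, ge, unge, check_reversal) are made up for illustration and are not taken from the patch. It only shows why LT reverses to UNGE rather than GE once a NaN is involved; it says nothing about which instructions the compiler picks.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordered "less than": false whenever either operand is a NaN.  */
static bool lt (double a, double b) { return isless (a, b); }

/* Ordered "greater or equal": also false whenever an operand is a NaN.  */
static bool ge (double a, double b) { return isgreaterequal (a, b); }

/* UNGE: unordered, or greater-or-equal.  */
static bool unge (double a, double b)
{
  return isunordered (a, b) || isgreaterequal (a, b);
}

/* Hypothetical self-check, not part of the patch.  */
static void check_reversal (void)
{
  volatile double qnan = NAN;	/* Quiet NaN; volatile avoids folding.  */

  /* For ordinary numbers, !LT agrees with both GE and UNGE.  */
  assert (!lt (1.0, 2.0) == ge (1.0, 2.0));
  assert (!lt (2.0, 1.0) == unge (2.0, 1.0));

  /* With a NaN operand, only UNGE is the reversal of LT.  */
  assert (!lt (qnan, 1.0) == unge (qnan, 1.0));
  assert (!lt (qnan, 1.0) != ge (qnan, 1.0));
}
```

Calling check_reversal () from a test driver should pass silently on an IEEE 754 target.
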
>
> But with power9, we can generate the XSCMPEQDP, XSCMPGTDP, or XSCMPGEDP
> instructions. This code now does not convert an unordered compare into an
> ordered compare. Instead, it does the opposite comparison and swaps the
> arguments. I.e. it converts:
>
> r = (a < b) ? c : d;
>
> into:
>
> r = (b > a) ? c : d;
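
As a rough illustration of swapping the arguments instead of reversing the comparison, the sketch below checks that `a < b` and the mirrored `b > a` select the same value, including when the operands are equal or one of them is a quiet NaN. The function names are hypothetical and chosen only for this example; the quiet isless/isgreater macros stand in for the comparison, which is an assumption of the sketch rather than what the patch emits.

```c
#include <assert.h>
#include <math.h>

/* r = (a < b) ? c : d, written with the quiet IEEE predicate.  */
static double
select_lt (double a, double b, double c, double d)
{
  return isless (a, b) ? c : d;
}

/* Same selection with the operands swapped and the operator mirrored.  */
static double
select_gt_swapped (double a, double b, double c, double d)
{
  return isgreater (b, a) ? c : d;
}

/* Hypothetical self-check, not part of the patch.  */
static void
check_swap (void)
{
  volatile double qnan = NAN;	/* Quiet NaN; volatile avoids folding.  */
  const double c = 10.0, d = 20.0;

  /* Ordinary operands, both orders.  */
  assert (select_lt (1.0, 2.0, c, d) == select_gt_swapped (1.0, 2.0, c, d));
  assert (select_lt (2.0, 1.0, c, d) == select_gt_swapped (2.0, 1.0, c, d));

  /* Equal operands: neither a < b nor b > a holds, so both pick d.  */
  assert (select_lt (1.0, 1.0, c, d) == d);
  assert (select_gt_swapped (1.0, 1.0, c, d) == d);

  /* NaN operand: both comparisons are false, so both pick d.  */
  assert (select_lt (qnan, 1.0, c, d) == d);
  assert (select_gt_swapped (qnan, 1.0, c, d) == d);
}
```

Unlike reversing the condition, the swap needs no unordered variant of the comparison, because both forms are false for exactly the same inputs.
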
>
> For the following code:
>
>     double
>     ordered_compare (double a, double b, double c, double d)
>     {
>       return __builtin_isgreater (a, b) ? c : d;
>     }
>
>     /* Verify normal > does generate xscmpgtdp. */
>
>     double
>     normal_compare (double a, double b, double c, double d)
>     {
>       return a > b ? c : d;
>     }
>
> with the following patch, GCC generates the following for power9, power10, and
> power11:
>
>     ordered_compare:
>             fcmpu 0,1,2
>             fmr 1,4
>             bnglr 0
>             fmr 1,3
>             blr
>
>     normal_compare:
>             xscmpgtdp 1,1,2
>             xxsel 1,4,3,1
>             blr
>
> I have built bootstrap compilers on big endian power9 systems and little
> endian power9/power10 systems and there were no regressions. Can I check this
> patch into the GCC trunk, and after a waiting period, can I check this into
> the active older branches?
>
> 2025-05-21 Michael Meissner <[email protected]>
>
> gcc/
>
> PR target/118541
> * config/rs6000/predicates.md (invert_fpmask_comparison_operator):
> Delete.
> (fpmask_reverse_args_comparison_operator): New predicate.
> * config/rs6000/rs6000-proto.h (rs6000_fpmask_reverse_args): New
> declaration.
> * config/rs6000/rs6000.cc (rs6000_fpmask_reverse_args): New function.
> * config/rs6000/rs6000.h (REVERSIBLE_CC_MODE): Do not allow floating
> point comparisons to be reversed unless -ffinite-math-only is used.
> * config/rs6000/rs6000.md (mov<SFDF:mode><SFDF2:mode>cc_p9): Add
> comment.
> (mov<SFDF:mode><SFDF2:mode>cc_invert_p9): Reverse the argument order for
> the comparison, and use an unordered comparison, instead of ordered
> comparison.
> (mov<mode>cc_invert_p10): Likewise.
>
> gcc/testsuite/
>
> PR target/118541
> * gcc.target/powerpc/pr118541.c: New test.
>