On Thu, Apr 24, 2025 at 8:10 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Thu, Apr 24, 2025 at 6:27 PM Jan Hubicka <hubi...@ucw.cz> wrote:
> >
> > > Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> > > or vpandn.
> > > Current register_operand/vector_operand could lose some optimization
> > > opportunity.
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >         * config/i386/predicates.md (vector_or_0_or_1s_operand): New
> > >         predicate.
> > >         (nonimm_or_0_or_1s_operand): Ditto.
> > >         * config/i386/sse.md (vcond_mask_<mode><sseintvecmodelower>):
> > >         Extend the predicate of operands1 to accept 0 or allones
> > >         operands.
> > >         (vcond_mask_<mode><sseintvecmodelower>): Ditto.
> > >         (vcond_mask_v1tiv1ti): Ditto.
> > >         (vcond_mask_<mode><sseintvecmodelower>): Ditto.
> > >         * config/i386/i386.md (mov<mode>cc): Ditto for operands[2] and
> > >         operands[3].
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.target/i386/blendv-to-maxmin.c: New test.
> > >         * gcc.target/i386/blendv-to-pand.c: New test.
> >
> > > diff --git a/gcc/testsuite/gcc.target/i386/blendv-to-maxmin.c
> > > b/gcc/testsuite/gcc.target/i386/blendv-to-maxmin.c
> > > new file mode 100644
> > > index 00000000000..042eb7d8f24
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/blendv-to-maxmin.c
> > > @@ -0,0 +1,12 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-march=x86-64-v3 -O2 -mfpmath=sse" } */
> > > +/* { dg-final { scan-assembler-times "vmaxsd" 1 } } */
> > > +
> > > +double
> > > +foo (double a)
> > > +{
> > > +  if (a > 0.0)
> > > +    return a;
> > > +  return 0.0;
> > > +}
> >
> > With -ffast-math this is matched as MAX_EXPR at gimple level. Without
> > -ffast-math we can not do that since MAX_EXPR (and RTL SMAX) are
> > explicitly documented as unspecified when one of parameters is nan.
> >
> > So without -ffast-math at combine time we see:
> > (insn 6 3 7 2 (set (reg:DF 103)
> >         (const_double:DF 0.0 [0x0.0p+0])) "e.c":7:1 169 {*movdf_internal}
> >      (nil))
> > (insn 7 6 12 2 (set (reg:DF 102 [ _2 ])
> >         (unspec:DF [
> >                 (reg:DF 104 [ a ])
> >                 (reg:DF 103)
> >             ] UNSPEC_IEEE_MAX)) "e.c":7:1 1825 {*ieee_smaxdf3}
> >      (expr_list:REG_DEAD (reg:DF 104 [ a ])
> >         (expr_list:REG_DEAD (reg:DF 103)
> >             (nil))))
> >
> > maxss is defined as:
> >
> > MAX(SRC1, SRC2)
> > {
> >   IF ((SRC1 = 0.0) and (SRC2 = 0.0)) THEN DEST := SRC2;
> >     ELSE IF (SRC1 = NaN) THEN DEST := SRC2; FI;
> >     ELSE IF (SRC2 = NaN) THEN DEST := SRC2; FI;
> >     ELSE IF (SRC1 > SRC2) THEN DEST := SRC1;
> >     ELSE DEST := SRC2;
> >   FI;
> > }
>
> Please see [1], "Maximum and minimum functions", which says:
>
> "The maxNum and minNum functions defined in the 2008 standard
> propagate a non-NaN when one input is NaN and the other input is a
> normal number.
>
> This problem will be fixed by the forthcoming revision of the
> standard. The new functions named maximum and minimum are certain to
> propagate NaNs.
> Some current implementations are deviating from both of these
> definitions. Max and min instructions in the x86 instruction set are
> implemented so that max(a,b) and min(a,b) give b if one of the inputs
> is NaN. This is useful because it corresponds to the behavior of the
> code expression a > b ? a : b. A compiler can translate this common
> high-level language expression into a single instruction."
>
> Unfortunately, SSE max and min instructions are incompatible with both
> standard revisions due to "ELSE IF (SRC1 = NaN) THEN DEST := SRC2;
> FI;"

Ehm, SSE max and min instructions are incompatible with -2019 because
of "ELSE IF (SRC1 = NaN) THEN DEST := SRC2; FI;" and with -2008 because
of "ELSE IF (SRC2 = NaN) THEN DEST := SRC2; FI;".

Uros.
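
For illustration only (not part of the patch): a minimal C sketch of the semantics discussed in this thread. The helper names sse_max and ternary_max are hypothetical. sse_max is a direct transcription of the MAX(SRC1, SRC2) pseudocode quoted above, and ternary_max is the expression a > b ? a : b, evaluated without -ffast-math.

```c
#include <math.h>

/* Hypothetical model of the quoted MAX(SRC1, SRC2) pseudocode:
   the second operand is the result whenever both inputs are zero
   or either input is NaN.  */
static double
sse_max (double src1, double src2)
{
  if (src1 == 0.0 && src2 == 0.0)
    return src2;
  if (isnan (src1))
    return src2;   /* yields the non-NaN operand: not a NaN-propagating maximum */
  if (isnan (src2))
    return src2;   /* yields the NaN operand: not maxNum */
  return src1 > src2 ? src1 : src2;
}

/* The source-level expression the instruction corresponds to.  Any
   comparison involving a NaN is false, so b is returned in that case.  */
static double
ternary_max (double a, double b)
{
  return a > b ? a : b;
}
```

Under this model the two functions agree on every input, including NaNs and signed zeros. sse_max (NAN, 1.0) gives 1.0, where a NaN-propagating maximum would give NaN, and sse_max (1.0, NAN) gives NaN, where maxNum would give 1.0. That is the asymmetry described in the correction above.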