On Thu, Apr 24, 2025 at 6:27 PM Jan Hubicka <hubi...@ucw.cz> wrote:
>
> > Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand
> > or vpandn.
> > Current register_operand/vector_operand could lose some optimization
> > opportunity.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >       * config/i386/predicates.md (vector_or_0_or_1s_operand): New
> >       predicate.
> >       (nonimm_or_0_or_1s_operand): Ditto.
> >       * config/i386/sse.md (vcond_mask_<mode><sseintvecmodelower>):
> >       Extend the predicate of operands[1] to accept 0 or allones
> >       operands.
> >       (vcond_mask_<mode><sseintvecmodelower>): Ditto.
> >       (vcond_mask_v1tiv1ti): Ditto.
> >       (vcond_mask_<mode><sseintvecmodelower>): Ditto.
> >       * config/i386/i386.md (mov<mode>cc): Ditto for operands[2] and
> >       operands[3].
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/i386/blendv-to-maxmin.c: New test.
> >       * gcc.target/i386/blendv-to-pand.c: New test.
>
> > diff --git a/gcc/testsuite/gcc.target/i386/blendv-to-maxmin.c b/gcc/testsuite/gcc.target/i386/blendv-to-maxmin.c
> > new file mode 100644
> > index 00000000000..042eb7d8f24
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/blendv-to-maxmin.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=x86-64-v3 -O2 -mfpmath=sse" } */
> > +/* { dg-final { scan-assembler-times "vmaxsd" 1 } } */
> > +
> > +double
> > +foo (double a)
> > +{
> > +  if (a > 0.0)
> > +    return a;
> > +  return 0.0;
> > +}
>
> With -ffast-math this is matched as MAX_EXPR at gimple level. Without
> -ffast-math we cannot do that, since MAX_EXPR (and RTL SMAX) are
> explicitly documented as unspecified when one of the parameters is NaN.
>
> So without -ffast-math at combine time we see:
> (insn 6 3 7 2 (set (reg:DF 103)
>         (const_double:DF 0.0 [0x0.0p+0])) "e.c":7:1 169 {*movdf_internal}
>      (nil))
> (insn 7 6 12 2 (set (reg:DF 102 [ _2 ])
>         (unspec:DF [
>                 (reg:DF 104 [ a ])
>                 (reg:DF 103)
>             ] UNSPEC_IEEE_MAX)) "e.c":7:1 1825 {*ieee_smaxdf3}
>      (expr_list:REG_DEAD (reg:DF 104 [ a ])
>         (expr_list:REG_DEAD (reg:DF 103)
>             (nil))))
>
> maxss is defined as:
>
> MAX(SRC1, SRC2)
> {
>     IF ((SRC1 = 0.0) and (SRC2 = 0.0)) THEN DEST := SRC2;
>         ELSE IF (SRC1 = NaN) THEN DEST := SRC2; FI;
>         ELSE IF (SRC2 = NaN) THEN DEST := SRC2; FI;
>         ELSE IF (SRC1 > SRC2) THEN DEST := SRC1;
>         ELSE DEST := SRC2;
>     FI;
> }

Please see [1], "Maximum and minimum functions", which says:

"The maxNum and minNum functions defined in the 2008 standard
propagate a non-NaN when one input is NaN and the other input is a
normal number.

This problem will be fixed by the forthcoming revision of the
standard. The new functions named maximum and minimum are certain to
propagate NaNs.
Some current implementations are deviating from both of these
definitions. Max and min instructions in the x86 instruction set are
implemented so that max(a,b) and min(a,b) give b if one of the inputs
is NaN. This is useful because it corresponds to the behavior of the
code expression a > b ? a : b. A compiler can translate this common
high-level language expression into a single instruction."

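Unfortunately, SSE max and min instructions are incompatible with both
standard revisions, because they return the second operand whenever
either input is NaN. Due to "ELSE IF (SRC1 = NaN) THEN DEST := SRC2;
FI;", a NaN in SRC1 yields the number in SRC2 (so NaN is not
propagated, as maximum requires), while a NaN in SRC2 yields NaN (so
the non-NaN input is not returned, as maxNum requires).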
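
To make the asymmetry concrete, here is a minimal C sketch that
transcribes the MAX(SRC1, SRC2) pseudocode quoted above and compares it
with the a > b ? a : b expression. The names model_maxsd, ternary_max
and check_model are made up for illustration; this is a model of the
documented semantics, not code from the patch or from GCC.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical model of the quoted MAX(SRC1, SRC2) pseudocode.
   Not a GCC or Intel API, only a transcription for illustration.  */
static double
model_maxsd (double src1, double src2)
{
  if (src1 == 0.0 && src2 == 0.0)   /* +0.0 and -0.0 compare equal.  */
    return src2;
  if (isnan (src1))
    return src2;
  if (isnan (src2))
    return src2;
  if (src1 > src2)
    return src1;
  return src2;
}

/* The source-level expression the instruction corresponds to.  */
static double
ternary_max (double a, double b)
{
  return a > b ? a : b;
}

/* Spot checks of the NaN and signed-zero cases discussed above.  */
static void
check_model (void)
{
  assert (model_maxsd (NAN, 1.0) == 1.0);      /* SRC1 NaN: SRC2 returned.  */
  assert (isnan (model_maxsd (1.0, NAN)));     /* SRC2 NaN: NaN returned.  */
  assert (signbit (model_maxsd (0.0, -0.0)));  /* Both zero: SRC2 (-0.0).  */
  assert (ternary_max (NAN, 1.0) == 1.0);      /* Same as the model.  */
  assert (isnan (ternary_max (1.0, NAN)));     /* Same as the model.  */
}
```

Calling check_model () from a test driver exercises the three
interesting cases. In this model the result depends on operand order
whenever a NaN or a pair of signed zeros is involved, which is why the
operation cannot be treated as a commutative max.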

[1] https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/background/nan-propagation.pdf

Uros.
