https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71921
Hongtao Liu <liuhongt at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
--- Comment #25 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #24)
> The testcase in the description works with GCC 15, the testcase in comment
> 16 does not. Neither works with GCC 16 for me.
>
> The x86 backend has ix86_expand_fp_movcc for scalar operations to detect
> min/max, for vector it relies on combine IIRC.
>
> We are not exposing the mask generation upon RTL expansion,
> expand_vec_cond_mask_optab_fn does not try to TER here and
> ix86_expand_sse_movcc does
>
>   else if (op_false == CONST0_RTX (mode))
>     {
>       x = expand_simple_binop (mode, AND, cmp, op_true,
>                                dest, 1, OPTAB_DIRECT);
>       if (x != dest)
>         emit_move_insn (dest, x);
>
> hereby confusing the existing combiner patterns (in general if this does not
> match to min/max that's of course a good optimization).
I see that the pattern is simplified to
(set (reg:V8SF 124)
    (and:V8SF (not:V8SF (lt:V8SF (reg:V8SF 123 [ MEM <const vector(8) float> [(const float *)input_12(D) + ivtmp.30_4 * 1] ])
                (const_vector:V8SF [
                        (const_double:SF 0.0 [0x0.0p+0]) repeated x8
                    ])))
        (reg:V8SF 123 [ MEM <const vector(8) float> [(const float *)input_12(D) + ivtmp.30_4 * 1] ])))
which should have the same semantics as x86 min/max, so the backend needs to
support these variants.
BTW, the testcase in the description works on GCC trunk with
-march=sapphirerapids/znver5 -O3, but not with -march=x86-64-v3.