https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101756
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we have
of_14 = of.3[iter.9_16];
ea_13 = ea.2[iter.9_16];
kk_3 = kk.4[iter.9_16];
_32 = kk_3 != 0;
_33 = ea_13 != -1;
_43 = ea_13 != -2;
_36 = MAX_EXPR <_33, _43>;
_41 = _32 < _36;
_31 = (int) _41;
of_46 = of_14 | _31;
retval.1[iter.9_16] = of_46;
and try to SLP vectorize
_33 = ea_13 != -1;
_43 = ea_13 != -2;
_36 = MAX_EXPR <_33, _43>;
where pattern recog said
t.i:2:1: note: === vect_determine_precisions ===
t.i:2:1: note: using boolean precision 32 for _32 = kk_3 != 0;
t.i:2:1: note: using boolean precision 64 for _33 = ea_13 != -1;
t.i:2:1: note: using boolean precision 64 for _43 = ea_13 != -2;
t.i:2:1: note: using boolean precision 32 for _41 = _32 < _36;
t.i:2:1: note: ivtmp_35 has no range info
t.i:2:1: note: iter.29_17 has range [0x1, 0x4]
t.i:2:1: note: can narrow to unsigned:3 without loss of precision: iter.29_17 = iter.29_16 + 1;
t.i:2:1: note: of_46 has no range info
t.i:2:1: note: _31 has no range info
t.i:2:1: note: === vect_pattern_recog ===
t.i:2:1: note: vect_recog_bool_pattern: detected: _31 = (int) _41;
t.i:2:1: note: bool pattern recognized: patt_45 = _41 ? 1 : 0;
where possible_vector_mask_operation_p doesn't include MAX_EXPR. The
MAX_EXPR is introduced late by forwprop:
@@ -189,8 +203,10 @@
_43 = ea_13 != -2;
_44 = _32 < _43;
_45 = (int) _44;
- _31 = _35 | _45;
- of_46 = _31 | of_14;
+ _36 = MAX_EXPR <_33, _43>;
+ _41 = _32 < _36;
+ _31 = (int) _41;
+ of_46 = of_14 | _31;
retval.21[iter.29_16] = of_46;
iter.29_17 = iter.29_16 + 1;
if (iter.29_17 != 4)
This is done by the following match.pd pattern:
/* Transform (@0 < @1 and @0 < @2) to use min,
(@0 > @1 and @0 > @2) to use max */
(for logic (bit_and bit_and bit_and bit_and bit_ior bit_ior bit_ior bit_ior)
op (lt le gt ge lt le gt ge )
ext (min min max max max max min min )
(simplify
(logic (op:cs @0 @1) (op:cs @0 @2))
(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
&& TREE_CODE (@0) != INTEGER_CST)
(op @0 (ext @1 @2)))))
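As a sanity check on the table above, here is a minimal C sketch that exhaustively verifies the eight (logic, op, ext) rewrites over a small integer range. It is not GCC code; the helper names (min_i, max_i, check_minmax_rewrites) are made up for illustration. The case relevant to this bug is the bit_ior/lt/max column, i.e. (a < b) | (a < c) becoming a < MAX (b, c).

```c
#include <assert.h>

/* Hypothetical helpers standing in for MIN_EXPR / MAX_EXPR on ints.  */
static int min_i (int a, int b) { return a < b ? a : b; }
static int max_i (int a, int b) { return a > b ? a : b; }

/* Returns 1 if all eight rewrites from the pattern's table hold for
   every a, b, c in [-3, 3], and 0 on the first mismatch.  */
static int
check_minmax_rewrites (void)
{
  for (int a = -3; a <= 3; a++)
    for (int b = -3; b <= 3; b++)
      for (int c = -3; c <= 3; c++)
        {
          /* bit_and rows: lt/le use min, gt/ge use max.  */
          if (((a < b) & (a < c)) != (a < min_i (b, c))) return 0;
          if (((a <= b) & (a <= c)) != (a <= min_i (b, c))) return 0;
          if (((a > b) & (a > c)) != (a > max_i (b, c))) return 0;
          if (((a >= b) & (a >= c)) != (a >= max_i (b, c))) return 0;

          /* bit_ior rows: lt/le use max, gt/ge use min.  */
          if (((a < b) | (a < c)) != (a < max_i (b, c))) return 0;
          if (((a <= b) | (a <= c)) != (a <= max_i (b, c))) return 0;
          if (((a > b) | (a > c)) != (a > min_i (b, c))) return 0;
          if (((a >= b) | (a >= c)) != (a >= min_i (b, c))) return 0;
        }
  return 1;
}
```

Calling check_minmax_rewrites () from a small main and checking for a return value of 1 is enough to exercise it. Note that the check only covers plain signed ints; it says nothing about how the operands are represented once vectorized.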
Before that, (int)kk!=0 < (int)ea!=0 was simplified to
a compare of booleans by
/* From fold_sign_changed_comparison and fold_widened_comparison.
FIXME: the lack of symmetry is disturbing. */
(for cmp (simple_comparison)
(simplify
(cmp (convert@0 @00) (convert?@1 @10))
(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
The vectorization "breaks" since we're doing
/* ??? Support other schemes than direct internal fn. */
internal_fn reduc_fn;
if (!reduction_fn_for_scalar_code (reduc_code, &reduc_fn)
|| reduc_fn == IFN_LAST)
gcc_unreachable ();
tree scalar_def = gimple_build (&epilogue, as_combined_fn (reduc_fn),
TREE_TYPE (TREE_TYPE (vec_def)),
vec_def);
but then, since vector bools are signed and the original bool is unsigned,
MAX isn't "correct" here anyway.
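
To illustrate the signedness point, a small C sketch follows. It assumes that a true vector bool lane is encoded as -1 (all bits set) and false as 0, which is my assumption about the representation rather than something stated above. The helper names are hypothetical. Under the scalar 0/1 encoding a signed MAX behaves like OR, but under the assumed 0/-1 encoding the same signed MAX behaves like AND.

```c
#include <assert.h>

/* Scalar bool encoding: false = 0, true = 1.  */
static int scalar_bool (int truth) { return truth ? 1 : 0; }

/* ASSUMED vector bool lane encoding: false = 0, true = -1.  */
static int mask_lane (int truth) { return truth ? -1 : 0; }

/* Signed maximum, standing in for MAX_EXPR on a signed type.  */
static int smax (int a, int b) { return a > b ? a : b; }

/* Returns 1 if MAX of two 0/1 bools equals their logical OR
   for all four input combinations.  */
static int
max_is_or_for_scalar_bools (void)
{
  for (int x = 0; x <= 1; x++)
    for (int y = 0; y <= 1; y++)
      if (smax (scalar_bool (x), scalar_bool (y)) != scalar_bool (x | y))
        return 0;
  return 1;
}

/* Returns 1 if signed MAX of two 0/-1 lanes equals their logical AND
   for all four input combinations, i.e. not the OR that was intended.  */
static int
max_is_and_for_mask_lanes (void)
{
  for (int x = 0; x <= 1; x++)
    for (int y = 0; y <= 1; y++)
      if (smax (mask_lane (x), mask_lane (y)) != mask_lane (x & y))
        return 0;
  return 1;
}
```

Both functions return 1, so with one operand true and the other false the scalar form yields true while the signed-lane form yields false.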