https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101756

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we have

  of_14 = of.3[iter.9_16];
  ea_13 = ea.2[iter.9_16];
  kk_3 = kk.4[iter.9_16];
  _32 = kk_3 != 0;
  _33 = ea_13 != -1;
  _43 = ea_13 != -2;
  _36 = MAX_EXPR <_33, _43>;
  _41 = _32 < _36;
  _31 = (int) _41;
  of_46 = of_14 | _31;
  retval.1[iter.9_16] = of_46;

and try to SLP vectorize

  _33 = ea_13 != -1;
  _43 = ea_13 != -2;
  _36 = MAX_EXPR <_33, _43>;

where pattern recog said

t.i:2:1: note:   === vect_determine_precisions ===
t.i:2:1: note:   using boolean precision 32 for _32 = kk_3 != 0;
t.i:2:1: note:   using boolean precision 64 for _33 = ea_13 != -1;
t.i:2:1: note:   using boolean precision 64 for _43 = ea_13 != -2;
t.i:2:1: note:   using boolean precision 32 for _41 = _32 < _36;
t.i:2:1: note:   ivtmp_35 has no range info
t.i:2:1: note:   iter.29_17 has range [0x1, 0x4]
t.i:2:1: note:   can narrow to unsigned:3 without loss of precision: iter.29_17 = iter.29_16 + 1;
t.i:2:1: note:   of_46 has no range info
t.i:2:1: note:   _31 has no range info
t.i:2:1: note:   === vect_pattern_recog ===
t.i:2:1: note:   vect_recog_bool_pattern: detected: _31 = (int) _41;
t.i:2:1: note:   bool pattern recognized: patt_45 = _41 ? 1 : 0;

Note that the dump has no boolean precision line for _36, since
possible_vector_mask_operation_p doesn't include MAX_EXPR.  The
MAX_EXPR is introduced late by forwprop:

@@ -189,8 +203,10 @@
   _43 = ea_13 != -2;
   _44 = _32 < _43;
   _45 = (int) _44;
-  _31 = _35 | _45;
-  of_46 = _31 | of_14;
+  _36 = MAX_EXPR <_33, _43>;
+  _41 = _32 < _36;
+  _31 = (int) _41;
+  of_46 = of_14 | _31;
   retval.21[iter.29_16] = of_46;
   iter.29_17 = iter.29_16 + 1;
   if (iter.29_17 != 4)

This is done by the following match.pd pattern:

/* Transform (@0 < @1 and @0 < @2) to use min,
   (@0 > @1 and @0 > @2) to use max */
(for logic (bit_and bit_and bit_and bit_and bit_ior bit_ior bit_ior bit_ior)
     op    (lt      le      gt      ge      lt      le      gt      ge     )
     ext   (min     min     max     max     max     max     min     min    )
 (simplify
  (logic (op:cs @0 @1) (op:cs @0 @2))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
       && TREE_CODE (@0) != INTEGER_CST)
   (op @0 (ext @1 @2)))))

and before that the comparison was simplified from
(int)kk!=0 < (int)ea!=0 to a compare of booleans by

/* From fold_sign_changed_comparison and fold_widened_comparison.
   FIXME: the lack of symmetry is disturbing.  */
(for cmp (simple_comparison)
 (simplify
  (cmp (convert@0 @00) (convert?@1 @10))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
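
As a sanity check on the two rewrites quoted above, here is a minimal C
sketch (not GCC code) that compares the statement shape before forwprop,
two widened compares ORed together, with the shape after it, one compare
of booleans against a MAX.  The helper names bool_max, before_forwprop,
after_forwprop and count_mismatches are made up for illustration, and the
sketch assumes scalar bools encoded as unsigned 0/1.

```c
#include <stdbool.h>

/* Hypothetical helper: maximum of two scalar bools, which are
   unsigned 0/1 values in C.  */
static bool bool_max (bool a, bool b)
{
  return a > b ? a : b;
}

/* Shape before forwprop: two compares on values widened to int,
   combined with a bitwise OR.  */
static int before_forwprop (bool k, bool a, bool b)
{
  return ((int) k < (int) a) | ((int) k < (int) b);
}

/* Shape after forwprop: a single compare of booleans against the
   MAX of the two right-hand operands.  */
static int after_forwprop (bool k, bool a, bool b)
{
  return (int) (k < bool_max (a, b));
}

/* Exhaustively compare both shapes over all 0/1 inputs and return
   the number of input combinations on which they differ.  */
static int count_mismatches (void)
{
  int bad = 0;
  for (int k = 0; k < 2; k++)
    for (int a = 0; a < 2; a++)
      for (int b = 0; b < 2; b++)
        if (before_forwprop (k, a, b) != after_forwprop (k, a, b))
          bad++;
  return bad;
}
```

Calling count_mismatches () from a small main should return 0, i.e. the
two shapes agree as long as the booleans are treated as unsigned 0/1
scalars.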


The vectorization "breaks" since we're doing

      /* ???  Support other schemes than direct internal fn.  */
      internal_fn reduc_fn;
      if (!reduction_fn_for_scalar_code (reduc_code, &reduc_fn)
          || reduc_fn == IFN_LAST)
        gcc_unreachable ();
      tree scalar_def = gimple_build (&epilogue, as_combined_fn (reduc_fn),
                                      TREE_TYPE (TREE_TYPE (vec_def)),
                                      vec_def);

but then, since vector bools are signed and the original bool is unsigned,
MAX isn't "correct" here anyway.
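
To illustrate the signedness point, here is a small C sketch (again not
GCC code).  It assumes that a true vector mask lane is all-ones, i.e. -1
when read as a signed element, while a scalar bool is an unsigned 0/1
value.  The function names are hypothetical.

```c
#include <stdint.h>

/* Scalar bool encoding: unsigned, true == 1.  */
static uint8_t scalar_bool_max (uint8_t a, uint8_t b)
{
  return a > b ? a : b;
}

/* Assumed mask lane encoding: signed, true == -1 (all ones).  */
static int8_t mask_lane_max (int8_t a, int8_t b)
{
  return a > b ? a : b;
}

/* Return 1 if unsigned MAX equals bitwise OR for all 0/1 inputs.  */
static int unsigned_max_is_or (void)
{
  for (int a = 0; a < 2; a++)
    for (int b = 0; b < 2; b++)
      if (scalar_bool_max (a, b) != (a | b))
        return 0;
  return 1;
}

/* Return 1 if signed MAX equals bitwise OR for all 0/-1 lanes.  */
static int signed_max_is_or (void)
{
  for (int a = 0; a >= -1; a--)
    for (int b = 0; b >= -1; b--)
      if (mask_lane_max (a, b) != (a | b))
        return 0;
  return 1;
}

/* Return 1 if signed MAX equals bitwise AND for all 0/-1 lanes.  */
static int signed_max_is_and (void)
{
  for (int a = 0; a >= -1; a--)
    for (int b = 0; b >= -1; b--)
      if (mask_lane_max (a, b) != (a & b))
        return 0;
  return 1;
}
```

Under that encoding a signed MAX of a false lane (0) and a true lane (-1)
gives 0, so it acts like AND on the mask values rather than the OR that
the unsigned scalar MAX computes.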
