https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98211

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Huh.  OK, so we do some pointless vectorization (the store is in a BB
ending in __builtin_unreachable()) but the actual issue must be the
live lane extraction into the not vectorized scalar code:

  vect_patt_95.26_71 = .VCOND_MASK (mask_patt_91.25_47, _59, _67);
  _79 = BIT_FIELD_REF <vect_patt_95.26_71, 16, 0>;

hmm, somehow the VCOND_MASK condition unpacking ends up with
v16_int8 = {1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0}

Looks like this is because we have

  _17 = var_12_18(D) != 0;
  _11 = test_var_3.4_1 != 0;
  _26 = _11 | _17;
  _99 = VIEW_CONVERT_EXPR<unsigned char>(_26);
  _35 = {_99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99};
  mask_patt_87.23_39 = VIEW_CONVERT_EXPR<vector(16) <signed-boolean:8>>(_35);
  mask_patt_91.25_47 = [vec_unpack_lo_expr] mask_patt_87.23_39;

but that doesn't produce the canonical -1 values, which means the bool
pattern is somehow broken.  We do code-generate

  mask_patt_87.23_39 = VIEW_CONVERT_EXPR<vector(16) <signed-boolean:8>>(_35);
  mask_patt_84.24_43 = mask_patt_87.23_39 ^ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  mask_patt_91.25_47 = [vec_unpack_lo_expr] mask_patt_84.24_43;

from the

(gdb) p debug (slp_node)
x.c:37:6: note: node 0x3f7e078 (max_nunits=16, refcnt=1)
x.c:37:6: note: op template: patt_84 = patt_87 != 0;
x.c:37:6: note:         stmt 0 patt_84 = patt_87 != 0;
x.c:37:6: note:         stmt 1 patt_84 = patt_87 != 0;
...

node by choosing BIT_XOR.  patt_87 has boolean vector type, but that
is just a cast:

x.c:41:31: note:   op template: patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:        stmt 0 patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:        stmt 1 patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:        stmt 2 patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:        stmt 3 patt_87 = (<signed-boolean:8>) _26;

which I guess is what is wrong, built via

#0  0x0000000002a217c0 in build_mask_conversion (vinfo=0x3e3e510,
mask=<ssa_name 0x7ffff69a3d80 26>, 
    vectype=<vector_type 0x7ffff699ad20>, stmt_vinfo=0x3e64490)
    at /home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:4230
#1  0x0000000002a22497 in vect_recog_mask_conversion_pattern (vinfo=0x3e3e510,
stmt_vinfo=0x3e64490, 
    type_out=0x7fffffffd060) at
/home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:4457
#2  0x0000000002a25505 in vect_pattern_recog_1 (vinfo=0x3e3e510, 
    recog_func=0x3b21050 <vect_vect_recog_func_ptrs+272>, stmt_info=0x3e64490)
    at /home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:5450
#3  0x0000000002a25a1f in vect_pattern_recog (vinfo=0x3e3e510)
    at /home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:5608

which is

      /* If rhs1 is a comparison we need to move it into a
         separate statement.  */
      if (TREE_CODE (rhs1) != SSA_NAME)
        {
          tmp = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL);
          if (rhs1_op0_type
              && TYPE_PRECISION (rhs1_op0_type) != TYPE_PRECISION (rhs1_type))
            rhs1_op0 = build_mask_conversion (vinfo, rhs1_op0,
                                              vectype2, stmt_vinfo);
          if (rhs1_op1_type
              && TYPE_PRECISION (rhs1_op1_type) != TYPE_PRECISION (rhs1_type))
            rhs1_op1 = build_mask_conversion (vinfo, rhs1_op1,
                                      vectype2, stmt_vinfo);
          pattern_stmt = gimple_build_assign (tmp, TREE_CODE (rhs1),
                                              rhs1_op0, rhs1_op1);
          rhs1 = tmp;
          append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt, vectype2,
                                  rhs1_type);
        }

x.c:41:31: note:   === vect_determine_precisions ===
x.c:41:31: note:   using normal nonmask vectors for _17 = var_12_18(D) != 0;
x.c:41:31: note:   using boolean precision 32 for _11 = test_var_3.4_1 != 0;
x.c:41:31: note:   using boolean precision 32 for _26 = _11 | _17;
...
x.c:41:31: note:   === vect_pattern_recog ===
x.c:41:31: note:   vect_recog_mask_conversion_pattern: detected: iftmp.2_10 = _26 != 0 ? iftmp.2_22 : iftmp.2_21;
x.c:41:31: note:   mask_conversion pattern recognized: patt_95 = patt_91 ? iftmp.2_22 : iftmp.2_21;
x.c:41:31: note:   extra pattern stmt: patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:   extra pattern stmt: patt_84 = patt_87 != 0;
x.c:41:31: note:   extra pattern stmt: patt_91 = (<signed-boolean:16>) patt_84;

note _26 is not part of the SLP but is splat from the scalar def.

As an SLP improvement, we should arguably have splat iftmp.2_10 itself, but
the change probably disabled that.

Still the above is a latent issue - I'll try to craft a more meaningful
testcase.
