https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98211
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Huh. OK, so we do some pointless vectorization (the store is in a BB ending in
__builtin_unreachable()) but the actual issue must be the live lane extraction
into the not vectorized scalar code:

  vect_patt_95.26_71 = .VCOND_MASK (mask_patt_91.25_47, _59, _67);
  _79 = BIT_FIELD_REF <vect_patt_95.26_71, 16, 0>;

hmm, somehow the VCOND_MASK condition unpacking ends up with

  v16_int8 = {1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0}

Looks like this is because we have

  _17 = var_12_18(D) != 0;
  _11 = test_var_3.4_1 != 0;
  _26 = _11 | _17;
  _99 = VIEW_CONVERT_EXPR<unsigned char>(_26);
  _35 = {_99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99, _99};
  mask_patt_87.23_39 = VIEW_CONVERT_EXPR<vector(16) <signed-boolean:8>>(_35);
  mask_patt_91.25_47 = [vec_unpack_lo_expr] mask_patt_87.23_39;

but that doesn't produce the canonical -1 values which means the bool pattern
is somehow broke. We do code-generate

  mask_patt_87.23_39 = VIEW_CONVERT_EXPR<vector(16) <signed-boolean:8>>(_35);
  mask_patt_84.24_43 = mask_patt_87.23_39 ^ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
  mask_patt_91.25_47 = [vec_unpack_lo_expr] mask_patt_84.24_43;

from the

(gdb) p debug (slp_node)
x.c:37:6: note: node 0x3f7e078 (max_nunits=16, refcnt=1)
x.c:37:6: note: op template: patt_84 = patt_87 != 0;
x.c:37:6: note:         stmt 0 patt_84 = patt_87 != 0;
x.c:37:6: note:         stmt 1 patt_84 = patt_87 != 0;
...

node by choosing BIT_XOR.
patt_87 has boolean vector type, but that is just a cast:

x.c:41:31: note: op template: patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:         stmt 0 patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:         stmt 1 patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:         stmt 2 patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note:         stmt 3 patt_87 = (<signed-boolean:8>) _26;

which I guess is what is wrong, built via

#0  0x0000000002a217c0 in build_mask_conversion (vinfo=0x3e3e510, mask=<ssa_name 0x7ffff69a3d80 26>, vectype=<vector_type 0x7ffff699ad20>, stmt_vinfo=0x3e64490)
    at /home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:4230
#1  0x0000000002a22497 in vect_recog_mask_conversion_pattern (vinfo=0x3e3e510, stmt_vinfo=0x3e64490, type_out=0x7fffffffd060)
    at /home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:4457
#2  0x0000000002a25505 in vect_pattern_recog_1 (vinfo=0x3e3e510, recog_func=0x3b21050 <vect_vect_recog_func_ptrs+272>, stmt_info=0x3e64490)
    at /home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:5450
#3  0x0000000002a25a1f in vect_pattern_recog (vinfo=0x3e3e510)
    at /home/rguenther/src/gcc2/gcc/tree-vect-patterns.c:5608

which is

      /* If rhs1 is a comparison we need to move it into a
         separate statement.  */
      if (TREE_CODE (rhs1) != SSA_NAME)
        {
          tmp = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL);
          if (rhs1_op0_type
              && TYPE_PRECISION (rhs1_op0_type) != TYPE_PRECISION (rhs1_type))
            rhs1_op0 = build_mask_conversion (vinfo, rhs1_op0,
                                              vectype2, stmt_vinfo);
          if (rhs1_op1_type
              && TYPE_PRECISION (rhs1_op1_type) != TYPE_PRECISION (rhs1_type))
            rhs1_op1 = build_mask_conversion (vinfo, rhs1_op1,
                                              vectype2, stmt_vinfo);
          pattern_stmt = gimple_build_assign (tmp, TREE_CODE (rhs1),
                                              rhs1_op0, rhs1_op1);
          rhs1 = tmp;
          append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt, vectype2,
                                  rhs1_type);
        }

x.c:41:31: note: === vect_determine_precisions ===
x.c:41:31: note: using normal nonmask vectors for _17 = var_12_18(D) != 0;
x.c:41:31: note: using boolean precision 32 for _11 = test_var_3.4_1 != 0;
x.c:41:31: note: using boolean precision 32 for _26 = _11 | _17;
...
x.c:41:31: note: === vect_pattern_recog ===
x.c:41:31: note: vect_recog_mask_conversion_pattern: detected: iftmp.2_10 = _26 != 0 ? iftmp.2_22 : iftmp.2_21;
x.c:41:31: note: mask_conversion pattern recognized: patt_95 = patt_91 ? iftmp.2_22 : iftmp.2_21;
x.c:41:31: note: extra pattern stmt: patt_87 = (<signed-boolean:8>) _26;
x.c:41:31: note: extra pattern stmt: patt_84 = patt_87 != 0;
x.c:41:31: note: extra pattern stmt: patt_91 = (<signed-boolean:16>) patt_84;

Note that _26 is not part of the SLP but is splat from the scalar def. As an
SLP improvement one could say we should have splat iftmp.2_10 itself, but the
change probably disabled that. Still, the above is a latent issue - I'll try
to craft a more meaningful testcase.
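
To illustrate why a non-canonical mask matters, here is a minimal scalar sketch of a single lane. It is a simplified model and not what GCC or any particular target emits: it assumes the select behaves like a bitwise blend and that unpacking a signed-boolean:8 lane sign-extends it to 16 bits. All function names (select_lane, unpack_mask_lane, lane_bytes, mask_model_self_check) are made up for this example. Under those assumptions a "true" lane stored as 1 instead of -1 picks neither operand cleanly, and its 16-bit form viewed as bytes is {1, 0}, which is consistent with the {1, 0, 1, 0, ...} pattern seen above.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise blend of one 16-bit lane.  This only yields if_true or
   if_false when the mask lane is all-ones (-1) or all-zeros (0).  */
static int16_t select_lane(int16_t mask, int16_t if_true, int16_t if_false)
{
    return (int16_t)((mask & if_true) | (~mask & if_false));
}

/* Model of widening one mask lane from 8 to 16 bits by sign extension
   (an assumption of this sketch).  -1 stays -1, but 1 stays 1.  */
static int16_t unpack_mask_lane(int8_t m)
{
    return (int16_t)m;
}

/* View a 16-bit lane as two bytes, low byte first.  */
static void lane_bytes(int16_t lane, uint8_t out[2])
{
    uint16_t u = (uint16_t)lane;
    out[0] = (uint8_t)(u & 0xff);
    out[1] = (uint8_t)(u >> 8);
}

/* Small self-check contrasting the canonical and non-canonical mask.  */
void mask_model_self_check(void)
{
    uint8_t b[2];

    /* Canonical true (-1) selects the first operand.  */
    assert(select_lane(unpack_mask_lane(-1), 0x1234, 0x0f0f) == 0x1234);

    /* Non-canonical true (1) mixes bits from both operands.  */
    assert(select_lane(unpack_mask_lane(1), 0x1234, 0x0f0f) == 0x0f0e);

    /* The widened non-canonical lane is the byte pair {1, 0}.  */
    lane_bytes(unpack_mask_lane(1), b);
    assert(b[0] == 1 && b[1] == 0);
}
```

Calling mask_model_self_check() from a test harness exercises both cases. The point of the model is only that a VIEW_CONVERT_EXPR of a 0/1 byte to a boolean vector lane is not the same as a proper conversion to a 0/-1 mask.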