On Tue, Feb 27, 2024 at 3:44 PM Richard Biener <rguent...@suse.de> wrote:
>
> On Tue, 27 Feb 2024, haochen.jiang wrote:
>
> > On Linux/x86_64,
> >
> > af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> > commit af66ad89e8169f44db723813662917cf4cbb78fc
> > Author: Richard Biener <rguent...@suse.de>
> > Date:   Fri Feb 23 16:06:05 2024 +0100
> >
> >     middle-end/114070 - folding breaking VEC_COND expansion
> >
> > caused
> >
> > FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
>
> This shows that the x86 backend is missing vcond_mask_qiqi and friends
Interesting, so both operand and mask are vector boolean.
> (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
> and all the machinery behind it (ISEL pass, lowering) should handle
> pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
> targets now implement patterns for these variants, whatever their
> boolean vector modes are.
>
> One complication with the change, which was
>
>   (simplify
>    (op @3 (vec_cond:s @0 @1 @2))
> -  (vec_cond @0 (op! @3 @1) (op! @3 @2))))
> +  (if (TREE_CODE_CLASS (op) != tcc_comparison
> +       || types_match (type, TREE_TYPE (@1))
> +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> +   (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
>
> is that expand_vec_cond_expr_p can also handle comparison defined
> masks, but whether or not we have this isn't visible here so we
> can only check whether vcond_mask expansion would work.
>
> We have optimize_vectors_before_lowering_p but we shouldn't even there
> turn supported into not supported ops and as said, what's supported or
> not cannot be finally decided (if it's only vcond and not vcond_mask
> that is supported).  Also optimize_vectors_before_lowering_p is set
> for a short time between vectorization and vector lowering and we
> definitely do not want to turn supported vectorizer emitted stmts
> into ones that we need to lower.  For GCC 15 we should see to move
> vector lowering before vectorization (before loop optimization I'd
> say) to close this particular hole (and also reliably ICE when the
> vectorizer creates unsupported IL).  We also definitely want to
> retire vcond expanders (no target I know of supports single-instruction
> compare-and-select).
>
> So short term we either live with this regression (the testcase
> verifies we perform constant folding to { 0, 0 }), implement
> the four missing patterns (qi, hi, si and di missing value mode
> vcond_mask patterns) or see to implement generic code for this.
>
> Given precedent I'd tend towards adding the x86 patterns.
>
> Hongtao, can you handle that?
Sure, I'll take a look.
>
> Thanks,
> Richard.

--
BR,
Hongtao
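
The alternative Richard mentions above, handling a pure integer mode VEC_COND_EXPR via bit operations, can be illustrated with a small scalar model in C. This is only a sketch under stated assumptions: `mask_select` is a hypothetical helper, not a GCC internal or an x86 pattern, and it assumes an 8-lane boolean vector stored one bit per lane in an 8-bit integer, roughly how a QImode AVX512 mask behaves. When the condition and both value operands are such masks, the select reduces to AND, ANDN and OR.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only (not a GCC API).
   Models cond ? if_true : if_false lane by lane, where each of the
   8 lanes is a single bit of an 8-bit mask value.  */
static uint8_t
mask_select (uint8_t cond, uint8_t if_true, uint8_t if_false)
{
  /* Lanes set in COND take the bit from IF_TRUE; lanes clear in COND
     take the bit from IF_FALSE.  The cast truncates the int-promoted
     result back to 8 lanes.  */
  return (uint8_t) ((cond & if_true) | (~cond & if_false));
}
```

For example, `mask_select (0xF0, 0xAA, 0x55)` takes the upper four lanes from `0xAA` and the lower four from `0x55`, giving `0xA5`. Whether this is done by generic expansion code or by target vcond_mask patterns is exactly the choice discussed in the thread.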