On Tue, Feb 27, 2024 at 3:44 PM Richard Biener <rguent...@suse.de> wrote:
>
> On Tue, 27 Feb 2024, haochen.jiang wrote:
>
> > On Linux/x86_64,
> >
> > af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> > commit af66ad89e8169f44db723813662917cf4cbb78fc
> > Author: Richard Biener <rguent...@suse.de>
> > Date:   Fri Feb 23 16:06:05 2024 +0100
> >
> >     middle-end/114070 - folding breaking VEC_COND expansion
> >
> > caused
> >
> > FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"
>
> This shows that the x86 backend is missing vcond_mask_qiqi and friends
Interesting, so both operand and mask are vector boolean.
> (for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
> and all the machinery behind it (ISEL pass, lowering) should handle
> pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
> targets now implement patterns for these variants, whatever their
> boolean vector modes are.
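As a rough illustration of the bit-operation fallback mentioned above:
when the mask and both value operands are plain integer mask modes
(e.g. QImode for an 8-lane AVX512 mask), a VEC_COND_EXPR is a
bitwise select.  This is only a sketch, not GCC code; the name
mask_select_qi is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper (not part of GCC): bitwise select on an
   8-bit mask value.  Each result bit comes from a where the
   corresponding bit of mask is set, and from b otherwise.  */
static inline uint8_t
mask_select_qi (uint8_t mask, uint8_t a, uint8_t b)
{
  return (uint8_t) ((mask & a) | (~mask & b));
}
```

The same and/andnot/or sequence would apply to the hi, si and di
mask modes with wider integer types.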
>
> One complication with the change, which was
>
>   (simplify
>    (op @3 (vec_cond:s @0 @1 @2))
> -  (vec_cond @0 (op! @3 @1) (op! @3 @2))))
> +  (if (TREE_CODE_CLASS (op) != tcc_comparison
> +       || types_match (type, TREE_TYPE (@1))
> +       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
> +   (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
>
> is that expand_vec_cond_expr_p can also handle comparison defined
> masks, but whether or not we have this isn't visible here so we
> can only check whether vcond_mask expansion would work.
>
> We have optimize_vectors_before_lowering_p but we shouldn't even there
> turn supported into not supported ops and as said, what's supported or
> not cannot be finally decided (if it's only vcond and not vcond_mask
> that is supported).  Also optimize_vectors_before_lowering_p is set
> for a short time between vectorization and vector lowering and we
> definitely do not want to turn supported vectorizer emitted stmts
> into ones that we need to lower.  For GCC 15 we should see to move
> vector lowering before vectorization (before loop optimization I'd
> say) to close this particular hole (and also reliably ICE when the
> vectorizer creates unsupported IL).  We also definitely want to
> retire vcond expanders (no target I know of supports single-instruction
> compare-and-select).
>
> So short term we either live with this regression (the testcase
> verifies we perform constant folding to { 0, 0 }), implement
> the four missing patterns (qi, hi, si and di missing value mode
> vcond_mask patterns) or see to implement generic code for this.
>
> Given precedent I'd tend towards adding the x86 patterns.
>
> Hongtao, can you handle that?
Sure, I'll take a look.
>
> Thanks,
> Richard.



-- 
BR,
Hongtao
