On Thu, 10 Oct 2024, Richard Sandiford wrote:
> My recent VLA SLP patches caused a regression with cross compilers
> in gcc.dg/torture/neon-sve-bridge.c. There we have a VEC_PERM_EXPR
> created from two BIT_FIELD_REFs, with the child node being an
> external VLA vector:
>
> note: node 0x3704a70 (max_nunits=1, refcnt=2) vector(2) long int
> note: op: VEC_PERM_EXPR
> note: stmt 0 val1Return_9 = BIT_FIELD_REF <sveReturn_8, 64, 0>;
> note: stmt 1 val2Return_10 = BIT_FIELD_REF <sveReturn_8, 64, 64>;
> note: lane permutation { 0[0] 0[1] }
> note: children 0x3704b08
> note: node (external) 0x3704b08 (max_nunits=1, refcnt=1) svint64_t
> note: { }
>
> For this kind of external node, the SLP_TREE_LANES is normally
> the total number of lanes in the vector, but it is zero if the
> vector has variable length:
>
> auto nunits = TYPE_VECTOR_SUBPARTS (SLP_TREE_VECTYPE (vnode));
> unsigned HOST_WIDE_INT const_nunits;
> if (nunits.is_constant (&const_nunits))
> SLP_TREE_LANES (vnode) = const_nunits;
>
> This led to division by zero in:
>
> /* Check whether the output has N times as many lanes per vector. */
> else if (constant_multiple_p (SLP_TREE_LANES (node) * op_nunits,
> SLP_TREE_LANES (child) * nunits,
> &this_unpack_factor)
> && (i == 0 || unpack_factor == this_unpack_factor))
> unpack_factor = this_unpack_factor;
>
> No repetition takes place for this kind of external node, so this
> patch goes with Richard's suggestion to check for external nodes
> that have no scalar statements.
>
> This didn't show up for my native testing since division by zero
> doesn't trap on AArch64.
>
> Bootstrapped & regreesion-tested on aarch64-linux-gnu and spot-checked
> with a cross compiler. OK to install?
OK.
Thanks,
Richard.
> gcc/
> * tree-vect-slp.cc (vectorizable_slp_permutation_1): Set repeating_p
> to false if we have an external node for a pre-existing vector.
> ---
> gcc/tree-vect-slp.cc | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 9bb765e2cba..1991fb1d3b6 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -10288,10 +10288,19 @@ vectorizable_slp_permutation_1 (vec_info *vinfo,
> gimple_stmt_iterator *gsi,
> }
> auto op_nunits = TYPE_VECTOR_SUBPARTS (op_vectype);
> unsigned int this_unpack_factor;
> + /* Detect permutations of external, pre-existing vectors. The external
> + node's SLP_TREE_LANES stores the total number of units in the vector,
> + or zero if the vector has variable length.
> +
> + We are expected to keep the original VEC_PERM_EXPR for such cases.
> + There is no repetition to model. */
> + if (SLP_TREE_DEF_TYPE (child) == vect_external_def
> + && SLP_TREE_SCALAR_OPS (child).is_empty ())
> + repeating_p = false;
> /* Check whether the input has twice as many lanes per vector. */
> - if (children.length () == 1
> - && known_eq (SLP_TREE_LANES (child) * nunits,
> - SLP_TREE_LANES (node) * op_nunits * 2))
> + else if (children.length () == 1
> + && known_eq (SLP_TREE_LANES (child) * nunits,
> + SLP_TREE_LANES (node) * op_nunits * 2))
> pack_p = true;
> /* Check whether the output has N times as many lanes per vector. */
> else if (constant_multiple_p (SLP_TREE_LANES (node) * op_nunits,
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)