SLP graph partitioning works on scalar stmts (because it is done for costing), so permute nodes, which have no scalar stmts, will not pull partitions together by themselves. We therefore have to make sure to visit permute nodes multiple times, once for each instance that reaches them.
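
To illustrate the effect, here is a minimal stand-alone sketch. It is a toy model, not the GCC code: Node, Leaders, ultimate_leader, walk and shared_permute_merges are hypothetical names, statements and instances are plain ints, and the skip_marking_empty flag exists only to compare the old and new behaviour. Two instances reach the same leaf through a shared node without scalar stmts.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy stand-in for an SLP node: the scalar stmt ids it covers
// (empty for a permute-like node) and its children.
struct Node {
  std::vector<int> stmts;
  std::vector<Node *> children;
};

// instance id -> leader instance id; a missing entry means "own leader".
using Leaders = std::map<int, int>;

// Follow the leader chain to its end.
int ultimate_leader(int inst, const Leaders &leaders) {
  for (auto it = leaders.find(inst); it != leaders.end();
       it = leaders.find(inst))
    inst = it->second;
  return inst;
}

// Walk the graph below NODE on behalf of instance INST.  Partitions are
// merged only when a stmt already claimed by another instance is seen.
void walk(Node *node, int inst, std::map<int, int> &stmt_to_inst,
          Leaders &leaders, std::set<Node *> &visited,
          bool skip_marking_empty) {
  assert(node != nullptr);
  for (int stmt : node->stmts) {
    auto res = stmt_to_inst.insert({stmt, inst});
    if (!res.second && res.first->second != inst) {
      // Stmt was claimed by an earlier instance: make INST the leader
      // of that instance's current ultimate leader.
      int leader = ultimate_leader(res.first->second, leaders);
      if (leader != inst)
        leaders[leader] = inst;
    }
    res.first->second = inst;
  }

  // With skip_marking_empty set, nodes without stmts are never recorded
  // as visited, so every instance walks through them to the children.
  bool may_mark = !node->stmts.empty() || !skip_marking_empty;
  if (may_mark && !visited.insert(node).second)
    return;

  for (Node *child : node->children)
    walk(child, inst, stmt_to_inst, leaders, visited, skip_marking_empty);
}

// Two instances share a permute-like node whose child carries stmts.
// Returns whether both instances end up in the same partition.
bool shared_permute_merges(bool skip_marking_empty) {
  Node leaf{{1, 2}, {}};
  Node permute{{}, {&leaf}};
  Node root_a{{3, 4}, {&permute}};
  Node root_b{{5, 6}, {&permute}};

  std::map<int, int> stmt_to_inst;
  Leaders leaders;
  std::set<Node *> visited;
  walk(&root_a, 0, stmt_to_inst, leaders, visited, skip_marking_empty);
  walk(&root_b, 1, stmt_to_inst, leaders, visited, skip_marking_empty);
  return ultimate_leader(0, leaders) == ultimate_leader(1, leaders);
}
```

In this toy setup, marking the stmt-less node as visited stops the second walk before it reaches the shared leaf, so the two instances stay in separate partitions. Leaving it unmarked lets the second walk see the leaf's stmts and merge them.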

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2021-04-06  Richard Biener  <rguent...@suse.de>

	PR tree-optimization/99924
	* tree-vect-slp.c (vect_bb_partition_graph_r): Do not mark
	nodes w/o scalar stmts as visited.

	* gfortran.dg/vect/pr99924.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/vect/pr99924.f90 | 12 ++++++++++++
 gcc/tree-vect-slp.c                        |  2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/vect/pr99924.f90

diff --git a/gcc/testsuite/gfortran.dg/vect/pr99924.f90 b/gcc/testsuite/gfortran.dg/vect/pr99924.f90
new file mode 100644
index 00000000000..f271ea1d0d5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/vect/pr99924.f90
@@ -0,0 +1,12 @@
+! { dg-do compile }
+! { dg-additional-options "-march=armv8.3-a" { target aarch64-*-* } }
+subroutine cunhj (tfn, asum, bsum)
+  implicit none
+  complex :: up, tfn, asum, bsum
+  real :: ar
+
+  up = tfn * ar
+  bsum = up + ar
+  asum = up + asum
+  return
+end subroutine cunhj
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ceec7f5c410..58dedfc35b7 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4224,7 +4224,7 @@ vect_bb_partition_graph_r (bb_vec_info bb_vinfo,
       stmt_instance = instance;
     }
 
-  if (visited.add (node))
+  if (!SLP_TREE_SCALAR_STMTS (node).is_empty () && visited.add (node))
     return;
 
   slp_tree child;
-- 
2.26.2