https://gcc.gnu.org/g:0192341a07b8ea30f631cf4afdc6fcf3fa7ce838

commit r15-1706-g0192341a07b8ea30f631cf4afdc6fcf3fa7ce838
Author: Richard Biener <rguent...@suse.de>
Date:   Wed Jun 26 14:07:51 2024 +0200

    tree-optimization/115640 - outer loop vect with inner SLP permute

    The following fixes wrong-code when using outer loop vectorization
    and an inner loop SLP access with permutation. A wrong adjustment
    to the IV increment is then applied on GCN.

            PR tree-optimization/115640
            * tree-vect-stmts.cc (vectorizable_load): With an inner
            loop SLP access to not apply a gap adjustment.

Diff:
---
 gcc/tree-vect-stmts.cc | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 0b0761bf799..7b889f31645 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10512,9 +10512,14 @@ vectorizable_load (vec_info *vinfo,
                  whole group, not only the number of vector stmts the
                  permutation result fits in. */
               unsigned scalar_lanes = SLP_TREE_LANES (slp_node);
-              if (slp_perm
-                  && (group_size != scalar_lanes
-                      || !multiple_p (nunits, group_size)))
+              if (nested_in_vect_loop)
+                /* We do not support grouped accesses in a nested loop,
+                   instead the access is contiguous but it might be
+                   permuted. No gap adjustment is needed though. */
+                vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
+              else if (slp_perm
+                       && (group_size != scalar_lanes
+                           || !multiple_p (nunits, group_size)))
                 {
                   /* We don't yet generate such SLP_TREE_LOAD_PERMUTATIONs for
                      variable VF; see vect_transform_slp_perm_load. */