PR83753 was about a case in which we ended up trying to "vectorise" a group of loads or stores using single-element vectors. The problem was that we were classifying the load or store as VMAT_CONTIGUOUS_PERMUTE rather than VMAT_CONTIGUOUS, even though it doesn't make sense to permute a single-element vector.

In that PR it was enough to change get_group_load_store_type, because vectorisation ended up being unprofitable and so we didn't take things further. But when vectorisation is profitable, the same fix is needed in vectorizable_load and vectorizable_store.

Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
OK to install?

Richard


2018-02-08  Richard Sandiford  <richard.sandif...@linaro.org>

gcc/
	PR tree-optimization/84265
	* tree-vect-stmts.c (vectorizable_store): Don't treat
	VMAT_CONTIGUOUS accesses as grouped.
	(vectorizable_load): Likewise.

gcc/testsuite/
	PR tree-optimization/84265
	* gcc.dg/vect/pr84265.c: New test.

Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2018-01-30 09:45:27.710764075 +0000
+++ gcc/tree-vect-stmts.c	2018-02-08 13:26:39.242566948 +0000
@@ -6214,7 +6214,8 @@ vectorizable_store (gimple *stmt, gimple
     }
 
   grouped_store = (STMT_VINFO_GROUPED_ACCESS (stmt_info)
-		   && memory_access_type != VMAT_GATHER_SCATTER);
+		   && memory_access_type != VMAT_GATHER_SCATTER
+		   && (slp || memory_access_type != VMAT_CONTIGUOUS));
   if (grouped_store)
     {
       first_stmt = GROUP_FIRST_ELEMENT (stmt_info);
@@ -7696,7 +7697,8 @@ vectorizable_load (gimple *stmt, gimple_
       return true;
     }
 
-  if (memory_access_type == VMAT_GATHER_SCATTER)
+  if (memory_access_type == VMAT_GATHER_SCATTER
+      || (!slp && memory_access_type == VMAT_CONTIGUOUS))
     grouped_load = false;
 
   if (grouped_load)
Index: gcc/testsuite/gcc.dg/vect/pr84265.c
===================================================================
--- /dev/null	2018-02-08 11:17:10.862716283 +0000
+++ gcc/testsuite/gcc.dg/vect/pr84265.c	2018-02-08 13:26:39.240567025 +0000
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+
+struct a
+{
+  unsigned long b;
+  unsigned long c;
+  int d;
+  int *e;
+  char f;
+};
+
+struct
+{
+  int g;
+  struct a h[];
+} i;
+
+int j, k;
+void l ()
+{
+  for (; k; k++)
+    j += (int) (i.h[k].c - i.h[k].b);
+}
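
For reference, the condition that the patch installs in vectorizable_store can be modelled in isolation as below. This is only an illustrative sketch, not code from GCC: the enum and the function treat_as_grouped are hypothetical stand-ins, and only the three access types named in this message are modelled. The load-side change appears to be the same test with the sense inverted.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vectoriser's memory access type;
   only the values mentioned in this message are modelled.  */
enum access_type
{
  ACCESS_CONTIGUOUS,
  ACCESS_CONTIGUOUS_PERMUTE,
  ACCESS_GATHER_SCATTER
};

/* Model of the patched grouped_store condition.  IN_GROUP plays the
   role of STMT_VINFO_GROUPED_ACCESS and SLP the role of the slp flag.
   A non-SLP access classified as contiguous is handled as an ordinary
   ungrouped access, since there is nothing to permute.  */
static bool
treat_as_grouped (bool in_group, bool slp, enum access_type type)
{
  return (in_group
	  && type != ACCESS_GATHER_SCATTER
	  && (slp || type != ACCESS_CONTIGUOUS));
}
```

Under this model, treat_as_grouped (true, false, ACCESS_CONTIGUOUS) is false, whereas the same call with slp set to true, or with ACCESS_CONTIGUOUS_PERMUTE, is still true.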