https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119181
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2025-03-10
            Summary|Missed vectorization due to |Missed vectorization due to
                   |imperfect SLP discovery for |imperfect SLP discovery for
                   |2 grouped load with same    |2 grouped load with same
                   |base pointer(taken as 1     |base pointer (taken as 1
                   |interleaved load)           |interleaved load)
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is we detect this as a single interleaving group:
t.c:12:1: note: Detected interleaving load of size 264
t.c:12:1: note: _1 = *a_26(D);
t.c:12:1: note: _5 = MEM[(double *)a_26(D) + 8B];
t.c:12:1: note: _7 = MEM[(double *)a_26(D) + 16B];
t.c:12:1: note: _11 = MEM[(double *)a_26(D) + 24B];
t.c:12:1: note: _14 = MEM[(double *)a_26(D) + 32B];
t.c:12:1: note: _17 = MEM[(double *)a_26(D) + 40B];
t.c:12:1: note: _19 = MEM[(double *)a_26(D) + 48B];
t.c:12:1: note: _22 = MEM[(double *)a_26(D) + 56B];
t.c:12:1: note: <gap of 248 elements>
t.c:12:1: note: _2 = MEM[(double *)a_26(D) + 2048B];
t.c:12:1: note: _4 = MEM[(double *)a_26(D) + 2056B];
t.c:12:1: note: _8 = MEM[(double *)a_26(D) + 2064B];
t.c:12:1: note: _10 = MEM[(double *)a_26(D) + 2072B];
t.c:12:1: note: _13 = MEM[(double *)a_26(D) + 2080B];
t.c:12:1: note: _16 = MEM[(double *)a_26(D) + 2088B];
t.c:12:1: note: _20 = MEM[(double *)a_26(D) + 2096B];
t.c:12:1: note: _23 = MEM[(double *)a_26(D) + 2104B];
so the heuristic that swaps operands to get a single group in the leaves
doesn't help. Instead you get offsetting costs, which are there to avoid
runaway with very large gaps:
*a_26(D) 132 times unaligned_load (misalign -1) costs 1584 in body
and that makes it unprofitable.
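
The numbers in the dump can be cross-checked with a little arithmetic. The
following is a minimal sketch, not GCC code: `summarise_group` and
`struct group_summary` are hypothetical names, and the two-lane vector and
the per-load cost of 12 are assumptions inferred from 264 / 132 and
1584 / 132 rather than values stated in the dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: summarise an interleaving group
   made of two dense runs of ELEMS_PER_RUN elements each, the second
   run starting SECOND_RUN_OFFSET bytes after the first.  */
struct group_summary
{
  size_t gap_elems;    /* elements skipped between the two runs */
  size_t group_size;   /* elements spanned by the whole group */
  size_t vector_loads; /* vector loads needed to cover the group */
  size_t load_cost;    /* vector_loads * cost_per_load */
};

static struct group_summary
summarise_group (size_t elem_bytes, size_t elems_per_run,
                 size_t second_run_offset, size_t lanes,
                 size_t cost_per_load)
{
  struct group_summary s;
  size_t first_run_bytes = elems_per_run * elem_bytes;

  /* Preconditions: non-degenerate sizes, and the second run must start
     at or after the end of the first, on an element boundary.  */
  assert (elem_bytes != 0 && lanes != 0);
  assert (second_run_offset >= first_run_bytes);
  assert (second_run_offset % elem_bytes == 0);

  s.gap_elems = (second_run_offset - first_run_bytes) / elem_bytes;
  s.group_size = 2 * elems_per_run + s.gap_elems;
  /* Round up: a partial vector still needs a full load.  */
  s.vector_loads = (s.group_size + lanes - 1) / lanes;
  s.load_cost = s.vector_loads * cost_per_load;
  return s;
}
```

Calling `summarise_group (8, 8, 2048, 2, 12)` models the case above: 8-byte
doubles, two runs of 8 elements, the second run at byte offset 2048. Under
those assumptions almost all of the charged loads cover the gap rather than
the 16 elements actually used.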
A better heuristic for where to split groups is indeed needed - gaps
bigger than the biggest vector size might be a good candidate. Note that
when two different interleaving groups are used in the same SLP leaf
we fail, as we don't support that yet.
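
One way to picture the suggested split rule is the sketch below. It is only
an illustration under stated assumptions, not the vectorizer's
implementation: `split_groups_at_large_gaps` is a hypothetical function, the
accesses are assumed to be already sorted by byte offset from a common base,
and the caller supplies the "biggest vector size" in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not GCC code.  OFFSETS holds N byte offsets from
   a common base, sorted ascending, each an access of ELEM_BYTES bytes.
   GROUP_ID[i] receives the index of the group that access i lands in.
   A new group starts whenever the hole between two consecutive accesses
   is larger than MAX_VECTOR_BYTES.  Returns the number of groups.  */
static size_t
split_groups_at_large_gaps (const size_t *offsets, size_t n,
                            size_t elem_bytes, size_t max_vector_bytes,
                            size_t *group_id)
{
  size_t groups = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (i == 0)
        groups = 1;
      else
        {
          /* Accesses must be sorted and non-overlapping.  */
          assert (offsets[i] >= offsets[i - 1] + elem_bytes);
          size_t hole = offsets[i] - (offsets[i - 1] + elem_bytes);
          if (hole > max_vector_bytes)
            groups++;
        }
      group_id[i] = groups - 1;
    }
  return groups;
}
```

With the sixteen offsets from the dump (0 to 56 and 2048 to 2104), 8-byte
elements and an assumed 64-byte maximum vector size, the 1984-byte hole
exceeds the threshold and the accesses fall into two groups of eight.
Splitting alone would not be enough here, since an SLP leaf that uses two
different interleaving groups is not supported yet.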