https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117502
Bug ID: 117502
Summary: Fail to SLP gcc.target/aarch64/sve/pr95199.c
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
We fail to SLP the non-unit-stride version of
gcc.target/aarch64/sve/pr95199.c, where we apply stride versioning. Without
SLP we can successfully use gathers, while with SLP we end up with:
pr95199.c:8:21: missed: Not using elementwise accesses due to variable vectorization factor.
pr95199.c:10:8: missed: not vectorized: relevant stmt not supported: _4 = *_3;
pr95199.c:8:21: note: unsupported SLP instance starting from: *_3 = _10;
pr95199.c:8:21: missed: unsupported SLP instances
This is easier to see when adding -fno-version-loops-for-strides.
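For readers unfamiliar with stride versioning, the following is a minimal sketch of the kind of loop involved. It is a hypothetical example, not the actual contents of pr95199.c: the function name `strided_update` and its parameters are assumptions made for illustration. The explicit `if` mimics by hand what versioning produces, namely a unit-stride copy of the loop and a general copy whose accesses need gathers/scatters (or elementwise accesses) to vectorize.

```c
#include <stddef.h>

/* Hypothetical strided update loop, written only to illustrate the shape
   of code the report is about.  It is NOT the source of pr95199.c.  */
void
strided_update (double *a, const double *b, double m, size_t n,
                ptrdiff_t stride_a, ptrdiff_t stride_b)
{
  if (stride_a == 1 && stride_b == 1)
    {
      /* Unit-stride version: contiguous loads and stores.  */
      for (size_t i = 0; i < n; ++i)
        a[i] += m * b[i];
    }
  else
    {
      /* Non-unit-stride version: every access steps by a runtime stride,
         so a vectorizer has to use gather/scatter or elementwise
         accesses.  This is the version the report says fails with SLP.  */
      for (size_t i = 0; i < n; ++i)
        a[(ptrdiff_t) i * stride_a] += m * b[(ptrdiff_t) i * stride_b];
    }
}
```

With -fno-version-loops-for-strides only the general form of the loop remains, which is why the missed vectorization is easier to spot there.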
We fail to get:
pr95199.c:8:21: note: ==> examining statement: _4 = *_3;
pr95199.c:8:21: note: using gather/scatter for strided/grouped access, scale =
pr95199.c:8:21: note: vect_model_load_cost: inside_cost = 2, prologue_cost = 0 .
The reason is that we're using VMAT_STRIDED_SLP and consider gathers only
for VMAT_ELEMENTWISE, failing to realize that the caller (get_load_store_type)
will reject both for a variable-length access. There's also practically
no difference between VMAT_ELEMENTWISE and VMAT_STRIDED_SLP for single-element
accesses.
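The decision problem described above can be modelled with a small toy program. This is only a sketch under stated assumptions: the enum and both functions are hypothetical stand-ins, not GCC's actual definitions or the code of get_load_store_type, and the "adjusted" variant is one conceivable change rather than the fix that was or will be committed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the vectorizer's access kinds; these are not
   GCC's real VMAT_* definitions.  */
enum access_kind
{
  ACCESS_ELEMENTWISE,
  ACCESS_STRIDED_SLP,
  ACCESS_GATHER_SCATTER,
  ACCESS_UNSUPPORTED
};

/* Toy model of the behaviour described in the report: the gather fallback
   is tried only for elementwise accesses, and afterwards both elementwise
   and strided-SLP accesses are rejected for variable-length vectors.  */
enum access_kind
choose_access_reported (enum access_kind initial, bool variable_length,
                        bool gather_available)
{
  enum access_kind kind = initial;
  if (kind == ACCESS_ELEMENTWISE && gather_available)
    kind = ACCESS_GATHER_SCATTER;
  if (variable_length
      && (kind == ACCESS_ELEMENTWISE || kind == ACCESS_STRIDED_SLP))
    return ACCESS_UNSUPPORTED;
  return kind;
}

/* One possible adjustment (an assumption, not the committed fix): treat a
   strided-SLP access with a single-element group like an elementwise one
   when looking for a gather fallback.  */
enum access_kind
choose_access_adjusted (enum access_kind initial, bool variable_length,
                        bool gather_available, unsigned group_size)
{
  enum access_kind kind = initial;
  bool like_elementwise
    = (kind == ACCESS_ELEMENTWISE
       || (kind == ACCESS_STRIDED_SLP && group_size == 1));
  if (like_elementwise && gather_available)
    kind = ACCESS_GATHER_SCATTER;
  if (variable_length
      && (kind == ACCESS_ELEMENTWISE || kind == ACCESS_STRIDED_SLP))
    return ACCESS_UNSUPPORTED;
  return kind;
}
```

In this model a single-element strided-SLP access with variable-length vectors is rejected by the first function even when gathers are available, while the second function lets it fall back to a gather, mirroring the non-SLP behaviour the report describes.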