https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116818
Richard Biener <rguenth at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Ever confirmed|0 |1 Last reconfirmed| |2024-09-23 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> --- I think this is SLP missing the corresponding /* As a last resort, trying using a gather load or scatter store. ??? Although the code can handle all group sizes correctly, it probably isn't a win to use separate strided accesses based on nearby locations. Or, even if it's a win over scalar code, it might not be a win over vectorizing at a lower VF, if that allows us to use contiguous accesses. */ if (*memory_access_type == VMAT_ELEMENTWISE && single_element_p && loop_vinfo && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo, masked_p, gs_info)) *memory_access_type = VMAT_GATHER_SCATTER; we can try moving that down to also catch SLP now that we set VMAT_ELEMENTWISE in some cases. That fixes the testcase but I have no idea what fallout is elsewhere. I'll propose that to the CI.