https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116760

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So on x86 the cost model difference between GCC 14.2 and trunk is

-(*co_271(D))[_95] 1 times vec_construct costs 792 in body
+(*co_271(D))[_95] 1 times vec_construct costs 88 in body

and similar for

-_103 1 times vec_to_scalar costs 72 in body
+_103 1 times vec_to_scalar costs 8 in body

r15-5565-gdbc38dd9e96a99 doesn't seem to fix this yet.  The reason is
that the cost hook special-cases VMAT_ELEMENTWISE with variable stride
for non-SLP, but does not do the same for VMAT_STRIDED_SLP with SLP.
With SLP the hook doesn't get all the info we'd like (how we use
lvectype/ltype vs. vectype).

For GCC 15 I'm going to emulate GCC 14 behavior here by special-casing
single-lane SLP.  For the future we want to let the backend know how many
and what kind of loads we do for VMAT_STRIDED_SLP; that's something the
cost hook doesn't give us yet.
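
To illustrate the special-casing idea, here is a toy model in C++.  It is
not GCC code: all type and function names are hypothetical, and the
(nunits + 1) scaling and the 8-element vector used in the usage note are
assumptions, chosen only because they reproduce the factor of nine seen
in the dumps above (792 vs 88, 72 vs 8).

```cpp
// Toy model only: every name here is hypothetical, none of it is GCC code.
enum class AccessKind { Contiguous, Elementwise, StridedSlp };

struct MemAccess {
  AccessKind kind;
  bool variable_stride;  // stride is not a compile-time constant
  unsigned slp_lanes;    // 0 = non-SLP, 1 = single-lane SLP, >1 = multi-lane
  unsigned nunits;       // elements per vector
};

// Decide whether an access should be costed as many scalar loads/stores.
// Without the special case only the elementwise variable-stride access
// qualifies; with it, single-lane strided SLP is treated the same way.
inline bool is_scalarised(const MemAccess &a, bool special_case_single_lane) {
  if (!a.variable_stride)
    return false;
  if (a.kind == AccessKind::Elementwise)
    return true;
  return special_case_single_lane
         && a.kind == AccessKind::StridedSlp
         && a.slp_lanes == 1;
}

// Assumed penalty: scale the base cost by (nunits + 1) for scalarised
// accesses, otherwise return the base cost unchanged.
inline unsigned stmt_cost(unsigned base_cost, const MemAccess &a,
                          bool special_case_single_lane) {
  return is_scalarised(a, special_case_single_lane)
             ? base_cost * (a.nunits + 1)
             : base_cost;
}
```

Under these assumptions, a single-lane strided SLP access with 8 elements
and a base vec_construct cost of 88 is costed 88 without the special case
and 792 with it; a base vec_to_scalar cost of 8 becomes 72.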
