https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107

--- Comment #8 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #7)
> perm_cost is very low in the x86 backend, which may be OK for 128-bit vectors,
> since pshufb/shufps are available for most cases.
> But for 256/512-bit vectors, when the permutation is cross-lane, the cost
> could be higher. One solution is to increase perm_cost when the vector size is
> more than 128 bits, since vperm is most likely used instead of
> vblend/vpblend/vpshuf/vshuf.

Furthermore, if we can get the indices in the backend when calculating the
vec_perm cost, we can check whether the permutation is cross-lane and set the
cost more accurately for 256/512-bit vector permutations.
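
To make the idea concrete, here is a minimal standalone sketch of such a check.
It is not GCC code: perm_is_cross_lane, sketch_perm_cost, and the two cost
parameters are hypothetical names for illustration only, and it assumes the
usual two-input convention where an index in [0, 2*nelts) selects from the
concatenation of both operands and a lane is 128 bits wide.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: x86 "lanes" are 128 bits (16 bytes) wide. */
#define LANE_BYTES 16u

/* Hypothetical helper: return true if any result element takes its value
   from a different 128-bit lane than the one it is written to.
   idx[i] is in [0, 2*nelts); idx[i] % nelts is the position inside
   whichever input operand is selected. */
static bool
perm_is_cross_lane (const unsigned *idx, size_t nelts, unsigned elt_bytes)
{
  if (elt_bytes == 0 || elt_bytes > LANE_BYTES)
    return false;

  size_t lane_elts = LANE_BYTES / elt_bytes;

  /* A vector of at most 128 bits has a single lane. */
  if (nelts <= lane_elts)
    return false;

  for (size_t i = 0; i < nelts; i++)
    {
      size_t src = idx[i] % nelts;
      if (src / lane_elts != i / lane_elts)
        return true;
    }
  return false;
}

/* Hypothetical cost function: charge cross_lane_cost only when the
   permutation really crosses lanes, otherwise keep base_cost.
   Both cost values are placeholders supplied by the caller. */
static int
sketch_perm_cost (const unsigned *idx, size_t nelts, unsigned elt_bytes,
                  int base_cost, int cross_lane_cost)
{
  return perm_is_cross_lane (idx, nelts, elt_bytes)
         ? cross_lane_cost : base_cost;
}
```

With eight 32-bit elements (a 256-bit vector), the index vector
{4,5,6,7,0,1,2,3} swaps the two lanes and would get the higher cost, while
{1,0,3,2,5,4,7,6} stays within each lane and would keep the base cost.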
