https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96933

--- Comment #4 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Yes, timing suggests there is some SHL/LHS flush.

On p9 and later we can use mtvsrdd instead of mtvsrd (moving two
bytes into place at once), which reduces the number of moves from
16 to 8, and the number of merges from 15 to 7 (and reduces path
length by 1).  This sounds like a no-brainer win with that :-)
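
The counts above can be checked with a small sketch in C. It assumes
that the pieces are combined by a balanced tree of pairwise merges, so
that n pieces need n - 1 merges and the tree is ceil(log2(n)) levels
deep. The helper names merge_count and merge_depth are hypothetical
and are not part of GCC.

```c
#include <assert.h>

/* Number of pairwise merges needed to combine `pieces` registers into
   one.  Each merge takes two inputs and produces one output, so the
   count of live pieces drops by one per merge.  Assumes pieces >= 1. */
static unsigned merge_count(unsigned pieces)
{
    assert(pieces >= 1);
    return pieces - 1;
}

/* Depth of a balanced binary merge tree over `pieces` inputs, that is
   ceil(log2(pieces)).  This is the merge part of the dependency path;
   it does not model instruction latencies.  Assumes pieces >= 1. */
static unsigned merge_depth(unsigned pieces)
{
    unsigned depth = 0;

    assert(pieces >= 1);
    while ((1u << depth) < pieces)
        depth++;
    return depth;
}
```

With 16 single-byte moves (mtvsrd), merge_count(16) is 15 and
merge_depth(16) is 4.  With 8 two-byte moves (mtvsrdd), merge_count(8)
is 7 and merge_depth(8) is 3, which matches the figures quoted above.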
