https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90073
--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> ---
Looking at the attached asm, the main issue is PR 55266 (there should be no copying), and how exactly the copies are done (64/128/256 bits) is almost a detail...