https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93771
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-02-17
                 CC|                            |rguenth at gcc dot gnu.org
             Blocks|                            |53947
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  I'm not sure if we should try to "fix" SLP here or rather
appropriately optimize

  v2df tem1 = *(v2df *)&t[0];
  v2df tem2 = *(v2df *)&t[2];
  __builtin_shuffle (tem1, tem2, (v2di) { 0, 3 });

which the user could write themselves.  forwprop does some related transforms
splitting loads in "Rewrite loads used only in BIT_FIELD_REF extractions to
component-wise loads."


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
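
For anyone who wants to try the shuffle form from comment #2 in isolation, here
is a minimal self-contained sketch. It is not the testcase attached to the bug.
The typedefs, the function name select_0_3 and the use of __builtin_memcpy in
place of the *(v2df *) casts are assumptions made here so the snippet compiles
on its own and does not rely on t being 16-byte aligned. It requires GCC, since
__builtin_shuffle is a GCC vector extension.

```c
/* GCC vector extensions: two doubles per vector, plus an integer vector
   type of the same element width and count to serve as the shuffle mask. */
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Hypothetical helper (name not from the bug report): load t[0..1] and
   t[2..3] as two vectors and pick { t[0], t[3] } with one shuffle. */
v2df
select_0_3 (const double *t)
{
  v2df tem1, tem2;

  /* Stand-in for the casts in the comment; avoids assuming alignment. */
  __builtin_memcpy (&tem1, &t[0], sizeof tem1);
  __builtin_memcpy (&tem2, &t[2], sizeof tem2);

  /* Mask indices 0..1 select from tem1, indices 2..3 select from tem2,
     so { 0, 3 } yields { tem1[0], tem2[1] }. */
  return __builtin_shuffle (tem1, tem2, (v2di) { 0, 3 });
}
```

Ideally only t[0] and t[3] would need to be loaded for this result, which is
the component-wise load splitting the comment alludes to. Whether a given GCC
version performs that transform is not something this sketch demonstrates.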