http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52607
--- Comment #1 from Marc Glisse <marc.glisse at normalesup dot org> 2012-03-17 01:05:57 UTC ---
Note that {1,2,0,3} seems harder: I need one extra vpermilpd. Actually, it looks like every v4df shuffle can be realized as a vblendpd of two intermediate results: a vpermilpd of the input, and a vpermilpd of the input after a vperm2f128. The same seems to hold for v8sf, though it may require the form of vpermilps that takes its controls from a register or memory operand.
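
The v4df claim can be checked with a small scalar model. The sketch below is not GCC code and uses no intrinsics. It assumes a single-input shuffle (result[i] = x[sel[i]] with each sel[i] in 0..3), and it models vperm2f128 only in its lane-swapping form (immediate 0x01 with both sources equal). The names model_*, shuffle_v4df and count_v4df_failures are made up for this illustration.

```c
#include <assert.h>

/* Scalar models of the AVX instructions on 4 doubles (index 0 = lowest). */

/* vpermilpd ymm, ymm, imm8: bit i of imm picks the low (0) or high (1)
   element of the 128-bit lane that contains position i. */
static void model_vpermilpd(const double src[4], unsigned imm, double dst[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = src[(i & 2) | ((imm >> i) & 1)];
}

/* vperm2f128 ymm, x, x, 0x01: swap the two 128-bit lanes of x. */
static void model_swap_lanes(const double src[4], double dst[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = src[i ^ 2];
}

/* vblendpd ymm, a, b, imm8: bit i of imm set takes b[i], otherwise a[i]. */
static void model_vblendpd(const double a[4], const double b[4],
                           unsigned imm, double dst[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = ((imm >> i) & 1) ? b[i] : a[i];
}

/* Compute out[i] = x[sel[i]] using only the modelled instructions.
   Assumption: one input vector, every sel[i] in 0..3. */
void shuffle_v4df(const double x[4], const int sel[4], double out[4])
{
    unsigned in_lane = 0, cross = 0, blend = 0;
    for (int i = 0; i < 4; i++) {
        assert(sel[i] >= 0 && sel[i] < 4);
        unsigned low_bit = (unsigned)sel[i] & 1u;
        if ((sel[i] & 2) == (i & 2)) {
            in_lane |= low_bit << i;   /* source is in the same lane */
        } else {
            cross |= low_bit << i;     /* source is in the other lane */
            blend |= 1u << i;          /* so take it from the swapped copy */
        }
    }
    double a[4], swapped[4], b[4];
    model_vpermilpd(x, in_lane, a);       /* vpermilpd */
    model_swap_lanes(x, swapped);         /* vperm2f128 */
    model_vpermilpd(swapped, cross, b);   /* vpermilpd */
    model_vblendpd(a, b, blend, out);     /* vblendpd */
}

/* Try all 4^4 = 256 selectors; return how many the sequence gets wrong. */
int count_v4df_failures(void)
{
    const double x[4] = {10.0, 11.0, 12.0, 13.0};
    int failures = 0;
    for (int code = 0; code < 256; code++) {
        int sel[4];
        double out[4];
        for (int i = 0; i < 4; i++)
            sel[i] = (code >> (2 * i)) & 3;
        shuffle_v4df(x, sel, out);
        for (int i = 0; i < 4; i++) {
            if (out[i] != x[sel[i]]) {
                failures++;
                break;
            }
        }
    }
    return failures;
}
```

Under these assumptions count_v4df_failures() should return 0. For {1,2,0,3}, positions 0 and 3 come from the in-lane vpermilpd and positions 1 and 2 from the lane-swapped one, so both vpermilpd results are needed, which matches the extra vpermilpd mentioned above. The model says nothing about two-input shuffles or about v8sf.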