https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68655
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 36897
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36897&action=edit
gcc6-pr68655.patch

Initial untested patch.  Unfortunately, it does not always seem to be a win
when comparing the output of the old and new compilers.
I'm currently looking at the output of:

cd /usr/src/gcc/gcc/testsuite/gcc.dg/torture
for i in vshuf-v*[hqs]i.c; do
  for j in -msse2 -msse4 -mavx -mavx2 -mavx512f -mavx512bw; do
    /usr/src/gcc/obj/gcc/cc1.v246 -quiet -O2 $j $i -DEXPENSIVE -o /tmp/1.s
    /usr/src/gcc/obj/gcc/cc1 -quiet -O2 $j $i -DEXPENSIVE -o /tmp/2.s
    echo ===$i $j===
    diff -up /tmp/1.s /tmp/2.s
  done
done

(where cc1.v246 is the vanilla cc1 and cc1 is the one with this patch applied).
In some cases the patch helps, but so far I've also seen some cases where,
for AVX512*, it results in more instructions.
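
To put a rough number on "more instructions" without reading every diff, something like the following sketch could be used next to the diff in the loop above. The helper names count_insns and compare_insns are made up for illustration and are not part of the patch or the testsuite. It assumes that in the cc1 assembly output an instruction is a line that starts with whitespace followed by a lowercase mnemonic, so labels and directives (which start with '.') are not counted. Instruction count is only a crude proxy for code quality.

```shell
# Hypothetical helper: count instruction lines in a GCC-generated .s file.
# Assumption: instructions are indented and start with a lowercase mnemonic;
# labels are unindented and directives start with '.', so neither matches.
count_insns() {
    grep -c -E '^[[:space:]]+[a-z]' "$1" || true
}

# Hypothetical helper: print "old -> new (delta)" for two assembly files.
compare_insns() {
    old=$(count_insns "$1")
    new=$(count_insns "$2")
    echo "$old -> $new ($((new - old)))"
}
```

Inside the loop, calling compare_insns /tmp/1.s /tmp/2.s after the echo line would print one summary line per test and option, where a positive delta means the patched compiler emitted more instructions.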