https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73350
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 39462
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39462&action=edit
gcc7-pr73350-wip.patch

Untested patch with what I had in mind for the mask operands.
Perhaps some extra arch specific pass that would optimize the VEC_MERGE with -1
mask into non-VEC_MERGE would be helpful too (or do it in stv pass?), otherwise
it will sometimes be folded in cse, sometimes in combine, but with no
guarantees.
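
For illustration only (this is not part of the attached patch): a minimal C
model of why a VEC_MERGE with a -1 mask can be replaced by its first operand.
It assumes the documented RTL semantics of vec_merge, where a set mask bit
takes the element from the first vector and a clear bit takes it from the
second. The names vec_merge_model, mask_is_all_ones and NELTS are hypothetical
and exist only for this sketch; they are not GCC functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element count for this sketch. */
#define NELTS 8

/* Model of (vec_merge a b mask): bit i set selects a[i], clear selects b[i]. */
void vec_merge_model(const int32_t *a, const int32_t *b,
                     uint32_t mask, int32_t *out)
{
    for (size_t i = 0; i < NELTS; i++)
        out[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}

/* Returns 1 when every element would come from the first operand, i.e. the
   merge is redundant and could be simplified to a plain use of a.  Only the
   low NELTS bits of the mask are relevant. */
int mask_is_all_ones(uint32_t mask)
{
    uint32_t full = (NELTS >= 32) ? UINT32_MAX : ((1u << NELTS) - 1u);
    return (mask & full) == full;
}
```

With an all-ones mask the second operand never contributes, which is the case
a dedicated simplification would catch deterministically instead of leaving it
to whichever of cse or combine happens to fold it.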