The problem being addressed is that expand_mult expects that any mode that supports multiply also supports shift. In practice that's a fairly safe assumption: I've never seen an ISA for which this didn't hold.
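
To make the dependency concrete, here is a minimal scalar sketch of why a multiply by a constant wants a shift in the same mode. It is not the code in expand_mult; the names LANES, add_bytes, shl_bytes and mul_bytes_const are hypothetical and exist only for this illustration. It assumes a 16-lane byte vector (as in V16QImode) and models the left shift as repeated self-addition, which is the scalar analogue of a paddb sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* hypothetical: one 16-lane byte vector, as in V16QImode */

/* Lane-wise wrapping add: the scalar analogue of a single paddb. */
static void add_bytes(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = (uint8_t)(a[i] + b[i]);
}

/* Left shift built only from adds: x << 1 == x + x, repeated count times. */
static void shl_bytes(uint8_t *dst, const uint8_t *src, unsigned count)
{
    assert(count < 8);  /* shift counts are kept below the 8-bit lane width */
    for (size_t i = 0; i < LANES; i++)
        dst[i] = src[i];
    while (count-- > 0)
        add_bytes(dst, dst, dst);
}

/* Multiply every lane by the constant k via shift-and-add: one shifted
   copy of src is accumulated for each set bit of k. */
static void mul_bytes_const(uint8_t *dst, const uint8_t *src, uint8_t k)
{
    uint8_t acc[LANES] = {0};
    uint8_t term[LANES];

    for (unsigned bit = 0; bit < 8; bit++) {
        if (k & (1u << bit)) {
            shl_bytes(term, src, bit);
            add_bytes(acc, acc, term);
        }
    }
    for (size_t i = 0; i < LANES; i++)
        dst[i] = acc[i];
}
```

With this sketch, mul_bytes_const(r, v, 10) computes (v << 1) + (v << 3) in each lane, using nothing but byte adds. Whether such a sequence is actually chosen over a real multiply is a matter of the relative costs.
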
However, it means that if you jump through hoops in the backend to provide a (vector) multiply, you'd better jump through those same hoops to provide a (vector) shift.

On the good side, with the costs adjusted properly, the complicated multiply expansion gets reduced to a (sequence of) paddb insns.

The first two patches clean things up so that the final patch, which adds support for vector shifts in V*QImode, is more readable.

Tested all together on x86_64-linux.  Visual spot checks of -mavx and -mavx2 code.


r~


Richard Henderson (3):
  i386: Extract the guts of mulv16qi3 to ix86_expand_vecop_qihi
  i386: Pass ix86_expand_sse_unpack operands by value
  PR target/53749

 gcc/ChangeLog                 |   25 +++++++
 gcc/config/i386/i386-protos.h |    4 +-
 gcc/config/i386/i386.c        |  164 +++++++++++++++++++++++++++++++++++++----
 gcc/config/i386/i386.md       |    3 +
 gcc/config/i386/sse.md        |  147 +++++++++---------------------------
 5 files changed, 217 insertions(+), 126 deletions(-)

-- 
1.7.10.2