https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80286
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 41111
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41111&action=edit
gcc7-pr80286.patch

Untested fix for the correctness part.

Now, if the shift count comes from memory or an XMM register, it would
certainly be faster and shorter to emit v?pmovzxdq before the shift (for
SSE4.1+) or pxor; punpck* (for SSE2). I wonder how best to do that, though.
First of all, it would be nice to transform a sign-extension followed by a
shift using the sign-extended count into a zero-extension followed by a shift,
because if the shift count is negative, the insn will treat it as a very large
count regardless of whether it is zero- or sign-extended.

The second question is whether the zero_extendsidi2 etc. instructions should
have a =v/vm alternative, at least for the SSE4 ISA; the pattern already has
an alternative with x, but uses ? to make it unlikely to be used.
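The zero- versus sign-extension argument above can be checked with a small scalar model. This is only a sketch, not code from the attached patch: it assumes the documented behaviour of PSLLD with the count in an XMM register (the low 64 bits of the count operand are read, and a count above 31 clears each 32-bit lane). All function names here are hypothetical. Only the logical left shift is modelled; arithmetic right shifts fill with the sign bit instead of zero, but the same reasoning applies because both extended counts exceed the element width.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model (an assumption for illustration) of one 32-bit lane of
   PSLLD with an XMM count: the insn reads a 64-bit count, and any
   count greater than 31 clears the lane. */
uint32_t model_pslld_lane(uint32_t lane, uint64_t count)
{
    return count > 31 ? 0 : lane << count;
}

/* Widen a 32-bit count to 64 bits by zero-extension. */
uint64_t zero_extend_count(int32_t count)
{
    return (uint64_t)(uint32_t)count;
}

/* Widen a 32-bit count to 64 bits by sign-extension. */
uint64_t sign_extend_count(int32_t count)
{
    return (uint64_t)(int64_t)count;
}

/* Returns 1 if both extensions produce the same shifted lane. */
int extensions_agree(uint32_t lane, int32_t count)
{
    return model_pslld_lane(lane, zero_extend_count(count))
           == model_pslld_lane(lane, sign_extend_count(count));
}

/* Check a few non-negative and negative counts. */
void self_check(void)
{
    static const int32_t counts[] = { 0, 1, 31, 32, 1000, -1, -32, INT32_MIN };
    for (unsigned i = 0; i < sizeof counts / sizeof counts[0]; i++)
        assert(extensions_agree(0x12345678u, counts[i]));
}
```

For non-negative counts the two extensions are bit-identical. For negative counts they differ in the upper 32 bits, but both are far above 31, so the modelled lane is cleared either way.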