https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82370
--- Comment #4 from Peter Cordes <peter at cordes dot ca> ---
VPANDQ can be shorter than an equivalent VPAND, for displacements > 127 but
<= 16 * 127 or 32 * 127 (for xmm or ymm operands respectively), and that are
an exact multiple of the vector width.  EVEX with disp8 always implies a
compressed displacement.  (See Intel manual vol.2 2.6.5 Compressed
Displacement (disp8*N) Support in EVEX).

# worst case for EVEX: odd displacement forcing a disp32 while VEX can use disp8
  c5 f9 db 4e 01                  vpand  0x1(%rsi),%xmm0,%xmm1
  62 f1 fd 08 db 8e 01 00 00 00   vpandq 0x1(%rsi),%xmm0,%xmm1

# Best case for EVEX, where it wins by one byte
# (or two vs. a 3-byte VEX + disp32, e.g. if I'd used %r10)
  c5 09 db be 00 02 00 00         vpand  0x200(%rsi),%xmm14,%xmm15
  62 71 8d 08 db 7e 20            vpandq 0x200(%rsi),%xmm14,%xmm15

# But the tables turn with an odd offset, where EVEX has to use disp32
  c5 09 db be ff 01 00 00         vpand  0x1ff(%rsi),%xmm14,%xmm15
  62 71 8d 08 db be ff 01 00 00   vpandq 0x1ff(%rsi),%xmm14,%xmm15
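
The byte counts in the listings above can be reproduced with a small model of the displacement rule. This is only a sketch, not how GCC or gas decide: the names `disp_bytes` and `insn_length` are made up for illustration, and it assumes a plain base register like %rsi (no SIB byte, and not %rbp/%r13, which always need a displacement), a one-byte opcode, and a full-vector memory operand without broadcast so that N equals the vector width in bytes.

```python
# Illustrative model only; names and simplifications are assumptions, see above.
PREFIX_BYTES = {"vex2": 2, "vex3": 3, "evex": 4}


def disp_bytes(disp: int, scale: int = 1) -> int:
    """Bytes used by the displacement field.

    scale is N for EVEX disp8*N, and 1 for VEX/legacy encodings.
    """
    if disp == 0:
        return 0  # mod=00: no displacement field at all
    if disp % scale == 0 and -128 <= disp // scale <= 127:
        return 1  # disp8 (stored as disp/N when scale > 1)
    return 4      # disp32 is never scaled


def insn_length(prefix: str, disp: int, vector_bytes: int = 16) -> int:
    """Total length: prefix + opcode byte + ModRM byte + displacement."""
    scale = vector_bytes if prefix == "evex" else 1
    return PREFIX_BYTES[prefix] + 1 + 1 + disp_bytes(disp, scale)


# Same displacements as the listings: 0x1, 0x200, 0x1ff with xmm operands.
for disp in (0x1, 0x200, 0x1ff):
    print(hex(disp), insn_length("vex2", disp), insn_length("evex", disp))
```

Under those assumptions this gives 5 vs. 10 bytes for 0x1, 8 vs. 7 for 0x200, and 8 vs. 10 for 0x1ff, matching the encodings shown.  Passing "vex3" for the 0x200 case gives 9, which is the two-byte win mentioned for a base register like %r10.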