https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83272
Jakub Jelinek <jakub at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> --- I don't believe the andl is not needed after shrb, as that is an 8-bit operand size, it should leave the upper 56 bits of the register unmodified. And unsigned char argument is in the ABI passed as int, so I think the upper 32 bits are undefined, which the andl instruction clears. Perhaps using shrl $4, %edi would be sufficient if the argument really has to be zero extended. In your static const char map[16] = "xyz"; void foo(unsigned char *src, char *buf) { int hi = src[0] / 16; int lo = src[0] % 16; buf[0] = map[hi]; buf[1] = map[lo]; } testcase, I suppose we'd need to use REE to figure out something the generic code can't know, in particular that the movzbl instruction also clears the upper 32 bits and that shrb instruction doesn't modify the upper 32 bits. REE sees: (insn 7 4 21 2 (set (reg:QI 0 ax [orig:87 _1 ] [87]) (mem:QI (reg/v/f:DI 5 di [orig:94 src ] [94]) [0 *src_6(D)+0 S1 A8])) "pr83272.c":4 88 {*movqi_internal} (nil)) (insn 21 7 9 2 (set (reg:QI 1 dx [97]) (reg:QI 0 ax [orig:87 _1 ] [87])) "pr83272.c":4 88 {*movqi_internal} (nil)) (insn 9 21 10 2 (parallel [ (set (reg:QI 1 dx [97]) (lshiftrt:QI (reg:QI 1 dx [97]) (const_int 4 [0x4]))) (clobber (reg:CC 17 flags)) ]) "pr83272.c":4 585 {*lshrqi3_1} (nil)) (insn 10 9 11 2 (set (reg:DI 1 dx [orig:98 hi ] [98]) (zero_extend:DI (reg:QI 1 dx [97]))) "pr83272.c":4 136 {zero_extendqidi2} (nil)) ... (insn 15 14 16 2 (parallel [ (set (reg:DI 0 ax [orig:102 lo ] [102]) (and:DI (reg:DI 0 ax [orig:87 _1 ] [87]) (const_int 15 [0xf]))) (clobber (reg:CC 17 flags)) ]) "pr83272.c":5 412 {*anddi_1} (nil)) The & 15 in that case is a must, that is the % 16 from the source, the upper bits can be set. And so the only thing that can be eliminated is the movzbl insn 10, but in order to find that out we'd need to understand first that insn 7 clears upper 56 bits of the register, that insn 21 if we actually emit movl %eax, %edx copies the low 32 bits (where from the earlier insn the upper 24 bits are cleared) and clears the high 32 bits (note if we emit instead movb %al, %dl, then this doesn't hold, as it leaves upper 56 bits of %rdx unmodified), and finally that insn 9 keeps upper 56 bits of %rdx unmodified and from previous insns we have guaranteed zeros there. In any case, this is far beyond what current REE can do ATM.