[Bug rtl-optimization/83272] Unnecessary mask instruction generated

jakub at gcc dot gnu.org Mon, 04 Dec 2017 10:52:00 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83272


Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I don't believe the andl is not needed after shrb, as that is an 8-bit operand
size, it should leave the upper 56 bits of the register unmodified.  And
unsigned char argument is in the ABI passed as int, so I think the upper 32
bits are undefined, which the andl instruction clears.  Perhaps using shrl $4,
%edi
would be sufficient if the argument really has to be zero extended.
In your
static const char map[16] = "xyz";
void foo(unsigned char *src, char *buf)
{
        int hi = src[0] / 16;
        int lo = src[0] % 16;
        buf[0] = map[hi];
        buf[1] = map[lo];
}
testcase, I suppose we'd need to use REE to figure out something the generic
code can't know, in particular that the movzbl instruction also clears the
upper 32 bits and that shrb instruction doesn't modify the upper 32 bits.
REE sees:
(insn 7 4 21 2 (set (reg:QI 0 ax [orig:87 _1 ] [87])
        (mem:QI (reg/v/f:DI 5 di [orig:94 src ] [94]) [0 *src_6(D)+0 S1 A8]))
"pr83272.c":4 88 {*movqi_internal}
     (nil))
(insn 21 7 9 2 (set (reg:QI 1 dx [97])
        (reg:QI 0 ax [orig:87 _1 ] [87])) "pr83272.c":4 88 {*movqi_internal}
     (nil))
(insn 9 21 10 2 (parallel [
            (set (reg:QI 1 dx [97])
                (lshiftrt:QI (reg:QI 1 dx [97])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) "pr83272.c":4 585 {*lshrqi3_1}
     (nil))
(insn 10 9 11 2 (set (reg:DI 1 dx [orig:98 hi ] [98])
        (zero_extend:DI (reg:QI 1 dx [97]))) "pr83272.c":4 136
{zero_extendqidi2}
     (nil))
...
(insn 15 14 16 2 (parallel [
            (set (reg:DI 0 ax [orig:102 lo ] [102])
                (and:DI (reg:DI 0 ax [orig:87 _1 ] [87])
                    (const_int 15 [0xf])))
            (clobber (reg:CC 17 flags))
        ]) "pr83272.c":5 412 {*anddi_1}
     (nil))

The & 15 in that case is a must, that is the % 16 from the source, the upper
bits can be set.  And so the only thing that can be eliminated is the movzbl
insn 10, but in order to find that out we'd need to understand first that insn
7 clears upper 56 bits of the register, that insn 21 if we actually emit movl
%eax, %edx copies the low 32 bits (where from the earlier insn the upper 24
bits are cleared) and clears the high 32 bits (note if we emit instead movb
%al, %dl, then this doesn't hold, as it leaves upper 56 bits of %rdx
unmodified), and finally that insn 9 keeps upper 56 bits of %rdx unmodified and
from previous insns we have guaranteed zeros there.
In any case, this is far beyond what current REE can do ATM.

[Bug rtl-optimization/83272] Unnecessary mask instruction generated

Reply via email to