https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |rtl-optimization

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that on the trunk I have changed the code slightly to get a cmove done.
With the cmove, we could simplify the following RTL:
Trying 27, 28 -> 29:
   27: {flags:CCZ=cmp(r86:SI&0x100,0);r82:SI=r86:SI&0x100;}
      REG_DEAD r86:SI
   28: r85:SI=0xffffffffffffffe6
   29: r82:SI={(flags:CCZ==0)?r82:SI:r85:SI}
      REG_DEAD r85:SI
      REG_DEAD flags:CCZ
Failed to match this instruction:
(set (reg/v:SI 82 [ tt ])
    (if_then_else:SI (eq (zero_extract:SI (reg:SI 86)
                (const_int 1 [0x1])
                (const_int 8 [0x8]))
            (const_int 0 [0]))
        (and:SI (reg:SI 86)
            (const_int 256 [0x100]))
        (const_int -26 [0xffffffffffffffe6])))
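
For reference, a source pattern along these lines could produce this RTL (a
hypothetical reconstruction from the dump, not necessarily the testcase
attached to this PR):

/* Hypothetical reconstruction from the RTL dump; the real testcase may
   differ.  */
int
f (int x)
{
  int tt = x & 0x100;  /* insn 27: sets flags:CCZ and r82 in one step */
  if (tt != 0)         /* insn 29: the cmove on flags:CCZ */
    tt = -26;          /* insn 28: the constant 0xffffffffffffffe6 */
  return tt;
}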

Note that in the eq arm the condition guarantees r86&0x100 is zero, so the
and arm is a known constant 0 and the whole expression reduces to
(x & 0x100) ? -26 : 0.  But that would be a 3->3 combine, and I don't know
whether combine does those; I know it does 3->1 and 3->2.

        andl    $256, %edi
        movl    $-26, %eax
        cmovne  %eax, %edi

I also don't know what the cost of doing a cmov vs. the shifts is here,
though.  I know that for aarch64 it is worse, but that should have been
modeled already.
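
For comparison, a branchless shift-based sequence (a sketch of what "the
shifts" could look like at the source level, not necessarily GCC's actual
output) would compute the same value:

/* Shift-based branchless form of tt = (x & 0x100) ? -26 : 0; a sketch for
   the cost comparison, not what GCC necessarily emits.  */
int
f_shifts (int x)
{
  int bit = (x >> 8) & 1;  /* isolate bit 8 */
  int mask = -bit;         /* 0 if the bit is clear, -1 if set */
  return mask & -26;       /* 0 or -26 */
}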
