On Mon, May 23, 2022 at 12:49 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Hopefully, if I explain even more of the context, you'll better understand why
> this harmless (and at worse seemingly redundant) peephole2 is actually 
> critical
> for addressing significant regressions in the compiler without introducing new
> testsuite failures.  I wouldn't ask (again), if I didn't feel it's important.
>
> Basically, I'm trying to unblock Hongtao's patch (for PR target/104610)
> which in your own review, explained is better handled by/during STV:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594070.html
>
> Unfortunately, that patch of mine to STV (that I want to ping next) that 
> solves
> the P2 code quality regression PR target/70321, is itself blocked by another
> review of yours:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593200.html
> where this fix (alone) leads to a regression of the test case pr65105-5.c.
>
> This pending regression has nothing to do with TARGET_BMI's andn, but
> the idiom "if ((x & y) != y)" on ia32, where x and y are DImode, and 
> stv/reload
> has decided to place these values in SSE registers.
>
> After combine we have an *anddi3_doubleword and *cmpdi3_doubleword:
> (insn 22 21 23 4 (parallel [
>             (set (reg:DI 97)
>                 (and:DI (reg/v:DI 92 [ p2 ])
>                     (reg:DI 88 [ _25 ])))
>             (clobber (reg:CC 17 flags))
>         ]) "pr65105-5.c":20:18 530 {*anddi3_doubleword}
>      (expr_list:REG_UNUSED (reg:CC 17 flags)
>         (nil)))
> (insn 23 22 24 4 (set (reg:CCZ 17 flags)
>         (compare:CCZ (reg/v:DI 92 [ p2 ])
>             (reg:DI 97))) "pr65105-5.c":20:8 29 {*cmpdi_doubleword}
>      (expr_list:REG_DEAD (reg:DI 97)
>         (nil)))

But originally, during combine we have (pr65105-5.c):

Trying 22 -> 23:
   22: {r97:DI=r92:DI&r88:DI;clobber flags:CC;}
      REG_UNUSED flags:CC
   23: {r98:DI=r92:DI^r97:DI;clobber flags:CC;}
      REG_DEAD r97:DI
      REG_UNUSED flags:CC
Successfully matched this instruction:
(parallel [
        (set (reg:DI 98)
            (and:DI (not:DI (reg:DI 88 [ _25 ]))
                (reg/v:DI 92 [ p2 ])))
        (clobber (reg:CC 17 flags))
    ])
allowing combination of insns 22 and 23
original costs 8 + 8 = 16
replacement cost 16
deferring deletion of insn with uid = 22.
modifying insn i3    23: {r98:DI=~r88:DI&r92:DI;clobber flags:CC;}
      REG_UNUSED flags:CC
deferring rescan insn with uid = 23.

so combine is creating:

(insn 23 22 24 4 (parallel [
            (set (reg:DI 98)
                (and:DI (not:DI (reg:DI 88 [ _25 ]))
                    (reg/v:DI 92 [ p2 ])))
            (clobber (reg:CC 17 flags))
        ]) "pr65105-5.c":20:8 552 {*andndi3_doubleword}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

why is this not the case anymore with your patch?

Uros.

Reply via email to