On Mon, May 23, 2022 at 12:49 PM Roger Sayle <ro...@nextmovesoftware.com> wrote: > > > Hi Uros, > Hopefully, if I explain even more of the context, you'll better understand why > this harmless (and at worse seemingly redundant) peephole2 is actually > critical > for addressing significant regressions in the compiler without introducing new > testsuite failures. I wouldn't ask (again), if I didn't feel it's important. > > Basically, I'm trying to unblock Hongtao's patch (for PR target/104610) > which in your own review, explained is better handled by/during STV: > https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594070.html > > Unfortunately, that patch of mine to STV (that I want to ping next) that > solves > the P2 code quality regression PR target/70321, is itself blocked by another > review of yours: > https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593200.html > where this fix (alone) leads to a regression of the test case pr65105-5.c. > > This pending regression has nothing to do with TARGET_BMI's andn, but > the idiom "if ((x & y) != y)" on ia32, where x and y are DImode, and > stv/reload > has decided to place these values in SSE registers. > > After combine we have an *anddi3_doubleword and *cmpdi3_doubleword: > (insn 22 21 23 4 (parallel [ > (set (reg:DI 97) > (and:DI (reg/v:DI 92 [ p2 ]) > (reg:DI 88 [ _25 ]))) > (clobber (reg:CC 17 flags)) > ]) "pr65105-5.c":20:18 530 {*anddi3_doubleword} > (expr_list:REG_UNUSED (reg:CC 17 flags) > (nil))) > (insn 23 22 24 4 (set (reg:CCZ 17 flags) > (compare:CCZ (reg/v:DI 92 [ p2 ]) > (reg:DI 97))) "pr65105-5.c":20:8 29 {*cmpdi_doubleword} > (expr_list:REG_DEAD (reg:DI 97) > (nil)))
But originally, during combine we have (pr65105-5.c): Trying 22 -> 23: 22: {r97:DI=r92:DI&r88:DI;clobber flags:CC;} REG_UNUSED flags:CC 23: {r98:DI=r92:DI^r97:DI;clobber flags:CC;} REG_DEAD r97:DI REG_UNUSED flags:CC Successfully matched this instruction: (parallel [ (set (reg:DI 98) (and:DI (not:DI (reg:DI 88 [ _25 ])) (reg/v:DI 92 [ p2 ]))) (clobber (reg:CC 17 flags)) ]) allowing combination of insns 22 and 23 original costs 8 + 8 = 16 replacement cost 16 deferring deletion of insn with uid = 22. modifying insn i3 23: {r98:DI=~r88:DI&r92:DI;clobber flags:CC;} REG_UNUSED flags:CC deferring rescan insn with uid = 23. so combine is creating: (insn 23 22 24 4 (parallel [ (set (reg:DI 98) (and:DI (not:DI (reg:DI 88 [ _25 ])) (reg/v:DI 92 [ p2 ]))) (clobber (reg:CC 17 flags)) ]) "pr65105-5.c":20:8 552 {*andndi3_doubleword} (expr_list:REG_UNUSED (reg:CC 17 flags) (nil))) why is this not the case anymore with your patch? Uros.