Hi!

On Mon, Jul 11, 2022 at 10:13:41AM +0800, HAO CHEN GUI wrote:
> I did a bisect for the problem. After commit "commit 8d2d39587: combine: Do
> not combine moves from hard registers", the case fails. The root cause is it
> can't combine from the hard registers and has to use subreg which causes its
> high part to be undefined. Thus, there is an additional "AND" generated.
>
> Before the commit
> Trying 2 -> 7:
>     2: r125:DI=%3:DI
>       REG_DEAD %3:DI
>     7: r128:SI=r125:DI#0 0>>0x1f
>       REG_DEAD r125:DI
> Successfully matched this instruction:
> (set (reg:SI 128 [ x ])
>     (lshiftrt:SI (reg:SI 3 3 [ x ])
>         (const_int 31 [0x1f])))
> allowing combination of insns 2 and 7
>
> After the commit
> Trying 20 -> 7:
>    20: r125:DI=r132:DI
>       REG_DEAD r132:DI
>     7: r128:SI=r125:DI#0 0>>0x1f
>       REG_DEAD r125:DI
> Failed to match this instruction:
> (set (subreg:DI (reg:SI 128 [ x ]) 0)
>     (zero_extract:DI (reg:DI 132)
>         (const_int 32 [0x20])
>         (const_int 1 [0x1])))
> Successfully matched this instruction:
> (set (subreg:DI (reg:SI 128 [ x ]) 0)
>     (and:DI (lshiftrt:DI (reg:DI 132)
>             (const_int 31 [0x1f]))
>         (const_int 4294967295 [0xffffffff])))
> allowing combination of insns 20 and 7
>
> The problem should be fixed in another case? Please advise.

You should not change the expected counts to what is currently
generated.  What is currently generated is sub-optimal.

It all starts with those zero_extracts, which are always bad for us --
it is a harder-to-manipulate representation of a limited subset of more
basic operations we *do* have.  And combine and simplify can handle the
more general and simpler formulation just fine.

Ideally combine would not try to use *_extract at all if this is not
used in the machine description (compare to rotatert for example, a
similarly redundant operation).  But it currently needs it as an
intermediate form; untangling this all is quite a bit of work.

These testcases (all the rl* ones) should have a big fat comment
explaining what the expected, wanted code is.  This was easier to do
originally, when I actually tested all 65536 possible combinations,
because the expected counts were more "regular" numbers.  But this is
too slow to test in normal testsuite runs :-)

It is wrong to pretend the current state makes the wanted code; these
testcases are meant to show exactly when we make suboptimal machine
code :-)


Segher
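
P.S.  To illustrate the redundancy point, here is a minimal C sketch (not
GCC code) of how an unsigned bit-field extraction reduces to a shift plus
an AND, and how a right rotate reduces to a left rotate.  The helper names
are hypothetical, and the sketch assumes bit 0 is the least significant
bit; RTL bit numbering for zero_extract depends on the target's
BITS_BIG_ENDIAN setting, so the position operand in a dump may count from
the other end.

```c
#include <stdint.h>

/* Hypothetical helper, not a GCC function: extract `width` bits of `x`
   starting at bit `pos` (assumption: bit 0 is the LSB, pos < 64,
   1 <= width <= 64).  This is just a logical shift right and an AND.  */
static uint64_t zero_extract_u64(uint64_t x, unsigned width, unsigned pos)
{
    uint64_t mask = (width >= 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << width) - 1);
    return (x >> pos) & mask;
}

/* Rotate left by n bits (n taken modulo 32).  */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate right by n bits, written as a left rotate by (32 - n):
   the same kind of redundancy as rotatert versus rotate.  */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    return rotl32(x, (32 - (n & 31)) & 31);
}
```

With these definitions, zero_extract_u64(x, 32, 31) computes the same
value as (x >> 31) & 0xffffffff, which is the shape of the shift-and-AND
form in the dump above.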