Hi!

On Mon, Jul 11, 2022 at 10:13:41AM +0800, HAO CHEN GUI wrote:
> I did a bisect for the problem. After commit 8d2d39587 ("combine: Do not
> combine moves from hard registers"), the case fails. The root cause is
> that it can't combine from the hard registers and has to use a subreg,
> which causes its high part to be undefined. Thus, there is an additional
> "AND" generated.
> 
> Before the commit
> Trying 2 -> 7:
>     2: r125:DI=%3:DI
>       REG_DEAD %3:DI
>     7: r128:SI=r125:DI#0 0>>0x1f
>       REG_DEAD r125:DI
> Successfully matched this instruction:
> (set (reg:SI 128 [ x ])
>     (lshiftrt:SI (reg:SI 3 3 [ x ])
>         (const_int 31 [0x1f])))
> allowing combination of insns 2 and 7
> 
> After the commit
> Trying 20 -> 7:
>    20: r125:DI=r132:DI
>       REG_DEAD r132:DI
>     7: r128:SI=r125:DI#0 0>>0x1f
>       REG_DEAD r125:DI
> Failed to match this instruction:
> (set (subreg:DI (reg:SI 128 [ x ]) 0)
>     (zero_extract:DI (reg:DI 132)
>         (const_int 32 [0x20])
>         (const_int 1 [0x1])))
> Successfully matched this instruction:
> (set (subreg:DI (reg:SI 128 [ x ]) 0)
>     (and:DI (lshiftrt:DI (reg:DI 132)
>             (const_int 31 [0x1f]))
>         (const_int 4294967295 [0xffffffff])))
> allowing combination of insns 20 and 7
> 
> The problem should be fixed in another case? Please advise.

You should not change the expected counts to what is currently
generated.  What is currently generated is sub-optimal.  It all starts
with those zero_extracts, which are always bad for us -- a zero_extract
is a harder-to-manipulate representation of a limited subset of more basic
operations we *do* have.  And combine and simplify can handle the more
general and simpler formulation just fine.
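
To make that concrete, here is a minimal C sketch (not GCC code) of why
the zero_extract in the dump quoted above says nothing a shift and an AND
cannot say.  The helper name zero_extract_msb is hypothetical, and it
assumes the position operand is counted from the most significant bit,
which is what the pair of patterns in that dump suggests.

```c
#include <stdint.h>

/* Hypothetical helper, not a GCC function: extract `size` bits from a
   64-bit value, where `pos` is counted from the most significant bit.
   Assumes pos + size <= 64 and size >= 1.  */
static uint64_t
zero_extract_msb (uint64_t x, unsigned size, unsigned pos)
{
  unsigned shift = 64 - pos - size;
  uint64_t mask = (size == 64) ? ~UINT64_C (0)
                               : ((UINT64_C (1) << size) - 1);
  return (x >> shift) & mask;
}

/* The more basic formulation from the dump:
   (and:DI (lshiftrt:DI x 31) 0xffffffff).  */
static uint64_t
shift_and_mask (uint64_t x)
{
  return (x >> 31) & UINT64_C (0xffffffff);
}
```

Under that assumption, zero_extract_msb (x, 32, 1) and shift_and_mask (x)
give the same result for every x, so the extract form adds no expressive
power over the shift-and-mask form.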

Ideally combine would not try to use *_extract at all if this is not
used in the machine description (compare to rotatert for example, a
similarly redundant operation).  But it currently needs it as an
intermediate form, and untangling all of this is quite a bit of work.

These testcases (all the rl* ones) should have a big fat comment
explaining what the expected, wanted code is.
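
Something along these lines, as a rough sketch only: the function below
is a hypothetical example and not one of the real rl* testcases, and the
wording of the comment is just an illustration of what such a comment
could say, based on the "before the commit" dump quoted above.

```c
/* Hypothetical testcase sketch, not one of the real rl* files.

   Wanted code: a single 32-bit logical shift right by 31, with no
   separate AND to clear the high part of the register.

   If an extra AND is generated today, that is a missed optimization.
   The expected counts describe the wanted code, not whatever happens
   to be generated at the moment.  */
unsigned int
high_bit (unsigned int x)
{
  return x >> 31;
}
```

With a comment like this in place, a failing count reads as "we make
suboptimal code here" rather than as a testcase that needs adjusting.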

This was easier to do originally, when I actually tested all 65536
possible combinations, because the expected counts were more "regular"
numbers.  But this is too slow to test in normal testsuite runs :-)

It is wrong to pretend the current state makes the wanted code: these
testcases are meant to show exactly when we make suboptimal machine
code :-)


Segher
