------- Comment #5 from ubizjak at gmail dot com  2007-07-12 07:05 -------
(In reply to comment #3)

> regmove should have changed that but it does not probably because the final
> constraint does not have a duplicate operand.  Actually, I think you want to
> look at anddi_1_rex64, not adddi_1_rex64:

Yes, of course. anddi is the problematic insn.
> 
>   [(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r")
>         (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0,0,qm")
>                 (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm,L")))
>    (clobber (reg:CC FLAGS_REG))]
> 
> The final constraint is for when and is used to create a zero-extending moves
> (L matches constants 0xFF and 0xFFFF).  I would say that you have to 1) define
> a predicate which has the same behavior as L and 2) split that alternative out
> of the three anddi patterns that use it (grep for '\<L\>') into a separate
> insn.

Hm, please note that we are not operating on constants, but strictly on
registers. We are dealing with alternative 2, "=rm/%0/re", so I think that
splitting L out of the insn pattern would not have the desired effect.
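
To make the alternative numbering concrete: each operand's constraint string is a comma-separated list, and alternative N is formed by taking the N-th entry of every operand together. The helper below is only an illustrative sketch, not GCC code; the name constraint_alternative and its interface are made up for this example. It counts from zero, so the alternative quoted above as "=rm/%0/re" is index 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): copy the n-th (0-based)
   comma-separated alternative of a constraint string into buf.
   The operand-level modifiers '=', '+' and '%' are skipped so that
   only the per-alternative letters remain.
   Returns 0 on success, -1 if the alternative does not exist or
   buf is too small. */
static int constraint_alternative(const char *constraint, int n,
                                  char *buf, size_t buflen)
{
    const char *p = constraint;

    /* Advance past n commas to reach the requested alternative. */
    while (n > 0) {
        p = strchr(p, ',');
        if (p == NULL)
            return -1;
        p++;
        n--;
    }

    /* Copy up to the next comma or the end of the string. */
    size_t len = 0;
    for (; *p != '\0' && *p != ','; p++) {
        if (*p == '=' || *p == '+' || *p == '%')
            continue;
        if (len + 1 >= buflen)
            return -1;
        buf[len++] = *p;
    }
    if (buflen == 0)
        return -1;
    buf[len] = '\0';
    return 0;
}
```

Applied to the three operands of the quoted pattern with n = 1, this yields "rm", "0" and "re", i.e. the register/memory alternative; n = 3 yields "r", "qm" and "L", the zero-extending-move alternative that the L constraint belongs to.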

However, I conducted a little experiment and changed (data & m1) into
(data - m1). The minus pattern is not commutative, and in the lreg pass we have
the following sequence (I removed the clobber of the TImode reg in the middle):

(insn:HI 24 23 25 2 pr32725.c:24 (set (reg:DI 74)
        (zero_extend:DI (mem:HI (plus:DI (mult:DI (reg:DI 73)
                        (const_int 2 [0x2]))
                    (reg/v/f:DI 65 [ src ])) [3 S2 A16]))) 114
{zero_extendhidi2} (expr_list:REG_DEAD (reg:DI 73)
        (nil)))

(insn:HI 25 24 28 2 pr32725.c:24 (parallel [
            (set (reg:DI 74)
                (minus:DI (reg:DI 74)
                    (reg:DI 71)))
            (clobber (reg:CC 17 flags))
        ]) 237 {*subdi_1_rex64} (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(insn:HI 33 28 37 2 pr32725.c:24 (parallel [
            (set (reg:TI 79)
                (mult:TI (zero_extend:TI (reg:DI 74))
                    (zero_extend:TI (reg:DI 70))))
            (clobber (reg:CC 17 flags))
        ]) 264 {*umulditi3_insn} (expr_list:REG_DEAD (reg:DI 74)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

A natural choice for reg 74 would be %rax, which satisfies the constraints of
the whole sequence. It indeed _looks_ as Rask says: the allocator doesn't
notice the REG_DEAD note in insn 24 and somehow blocks the use of %rax for the
minus and mult expressions. This leads to an extra reload (insn 101):

(insn:HI 24 23 25 2 pr32725.c:24 (set (reg:DI 3 bx [74])
        (zero_extend:DI (mem:HI (plus:DI (mult:DI (reg:DI 0 ax [73])
                        (const_int 2 [0x2]))
                    (reg/v/f:DI 4 si [orig:65 src ] [65])) [3 S2 A16]))) 114
{zero_extendhidi2} (nil))

(insn:HI 25 24 101 2 pr32725.c:24 (parallel [
            (set (reg:DI 3 bx [74])
                (minus:DI (reg:DI 3 bx [74])
                    (reg:DI 38 r9 [71])))
            (clobber (reg:CC 17 flags))
        ]) 237 {*subdi_1_rex64} (nil))

(insn 101 25 33 2 pr32725.c:24 (set (reg:DI 0 ax)
        (reg:DI 3 bx [74])) 82 {*movdi_1_rex64} (nil))

(insn:HI 33 101 102 2 pr32725.c:24 (parallel [
            (set (reg:TI 0 ax)
                (mult:TI (zero_extend:TI (reg:DI 0 ax))
                    (zero_extend:TI (reg:DI 39 r10 [70]))))
            (clobber (reg:CC 17 flags))
        ]) 264 {*umulditi3_insn} (nil))
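
For readers without the test case at hand, the shape of the experiment can be sketched in C. This is not the code from pr32725.c, only a guess reconstructed from the RTL above: a 16-bit load that is zero-extended, the subtraction that replaced the original (data & m1), and a 64x64->128 bit unsigned multiply. The function name mul_high, the parameter m2, and the choice to return the high half of the product are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical reduction, assuming an x86-64 GCC with __int128 support.
   Only "data" and "m1" are names taken from the discussion; everything
   else is invented for this sketch. */
static uint64_t mul_high(const uint16_t *src, uint64_t idx,
                         uint64_t m1, uint64_t m2)
{
    /* 16-bit load widened to 64 bits (zero_extendhidi2, insn 24). */
    uint64_t data = src[idx];

    /* The experiment: originally (data & m1), changed to a
       non-commutative operation (*subdi_1_rex64, insn 25). */
    data = data - m1;

    /* Widening unsigned multiply (*umulditi3_insn, insn 33); on x86-64
       the mul instruction takes one input in %rax. */
    unsigned __int128 prod = (unsigned __int128)data * m2;

    /* Assumption: some part of the 128-bit result is used afterwards. */
    return (uint64_t)(prod >> 64);
}
```

Compiling something of this shape with -O2 and inspecting the register allocation dumps is one way to check whether the value feeding the multiply lands in %rax directly or needs the extra move shown as insn 101.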


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32725
