Re: [PATCH][RFC] Bug handling SUBREG (MEM) - MEM having side-effects?

Richard Sandiford Sun, 18 Nov 2012 02:55:28 -0800

Tejas Belagod <tbela...@arm.com> writes:

> Hi,
>
> I seem to have uncovered what seems to be a bug with handling SUBREG (MEM) 
> with
> the MEM having side-effects while testing aarch64-4.7. This bug seems to be
> latent on trunk.
>
> Here is a test case reduced from gcc.c-torture/execute/scal-to-vec1.c.
>
> #define vector(elcount, type)  \
> __attribute__((vector_size((elcount)*sizeof(type)))) type
>
> #define vidx(type, vec, idx) (*((type *) &(vec) + idx))
>
> #define operl(a, b, op) (a op b)
> #define operr(a, b, op) (b op a)
>
> #define check(type, count, vec0, vec1, num, op, lr) \
> do {\
>         int __i; \
>         for (__i = 0; __i < count; __i++) {\
>             if (vidx (type, vec1, __i) != oper##lr (num, vidx (type, vec0, 
> __i),
> op)) \
>                 __builtin_abort (); \
>         }\
> } while (0)
>
> #define veccompare(type, count, v0, v1) \
> do {\
>         int __i; \
>         for (__i = 0; __i < count; __i++) { \
>             if (vidx (type, v0, __i) != vidx (type, v1, __i)) \
>                 __builtin_abort (); \
>         } \
> } while (0)
>
> volatile int one = 1;
>
> int main (int argc, char *argv[]) {
>
>         vector(8, short) v0 = {one, 1, 2, 3, 4, 5, 6, 7};
>         vector(8, short) v1;
>         v1 = 2 << v0;   check (short, 8, v0, v1, 2, <<, l);
>         return 0;
> }
>
> When compiled with -O1, the combiner tries to combine these two instructions:
>
> (insn 87 86 89 4 (set (reg:SI 154 [ MEM[base: D.2294_40, offset: 0B] ])
>             (sign_extend:SI (mem:HI (pre_inc:DI (reg:DI 113 [ ivtmp.23 ])) [0
> MEM[base: D.2294_40, offset: 0B]+0 S2 A16]))) scal-to-vec1.c:35 54
> {*extendhisi2_aarch64}
>          (expr_list:REG_INC (reg:DI 113 [ ivtmp.23 ])
>             (nil)))
>
> (insn 89 87 90 4 (set (reg:SI 155)
>             (ashift:SI (reg:SI 158)
>                 (subreg:QI (reg:SI 154 [ MEM[base: D.2294_40, offset: 0B] ]) 
> 0)))
> scal-to-vec1.c:35 331 {*ashlsi3_insn}
>          (expr_list:REG_DEAD (reg:SI 154 [ MEM[base: D.2294_40, offset: 0B] ])
>             (nil)))
>
> It tries to forward-propagate reg 154 to insn 89 and the intermediate RTL 
> looks
> like this:
>
> (insn 89 87 90 4 (set (reg:SI 155)
>             (ashift:SI (reg:SI 158)
>                 (subreg:QI ((sign_extend:SI (mem:HI (pre_inc:DI (reg:DI 113
>                          [ivtmp.23 ])) [0 MEM[base:
>                      D.2294_40, offset: 0B]+0 S2 A16])))
>
> The combiner then recursively tries to simplify this RTL. During simplifying 
> the
> subreg part of the RTL i.e.
>
> (subreg:QI ((sign_extend:SI (mem:HI (pre_inc:DI (reg:DI 113 ivtmp.23 ])) [0
>                          MEM[base: D.2294_40, offset:
>                              0B]+0 S2 A16])))
>
> combine_simplify_rtx () calls simplify_subreg () and this returns
>
> (subreg:QI (mem:HI (pre_inc:DI (reg:DI 113 ivtmp.23 ])) [0
>                  MEM[base: D.2294_40, offset: 0B]+0 S2 A16])))
>
> The code that does this is in combine.c:combine_simplify_rtx ()
>
> ...
>      rtx temp;
>      temp = simplify_subreg (mode, SUBREG_REG (x), op0_mode,
>                  SUBREG_BYTE (x));
>      if (temp)
>        return temp;
>           }
> ...
>
> So, the above tree is returned as is and insn 87 gets combined into insn 89 to
> become
>
> (insn 89 87 90 4 (set (reg:SI 155)
>             (ashift:SI (reg:SI 158)
>         (subreg:QI (mem:HI (pre_inc:DI (reg:DI 113 ivtmp.23 ])) [0
>                  MEM[base: D.2294_40, offset: 0B]+0 S2 A16])))
>
> This ends up in reload and in cleanup_subreg_operands () the subreg:QI 
> (mem:HI)
> gets narrowed to mem:QI, but the interesting bit is that the pre_inc stays the
> same, so becomes a pre_inc for a QI mode address rather than the original HI
> mode address and therefore instead of addressing a byte and incrementing the
> address by 2 to get the next half word, the address gets incremented by 1!
>
> Also, I feel this is a latent bug on the trunk. This is because while the
> combiner takes the same path on the trunk, the address expression is not a
> pre_inc there - the trunk wants to keep the addressing to a mem (plus (reg)
> (const)) and therefore combine_simplify_rtx () simplifies the subreg 
> expression
> right down to the narrowed memref i.e. mem:QI (plus:DI (reg:DI) (const)). This
> gets combined to insn 89 to become
>
> (insn 89 87 90 4 (set (reg:SI 155)
>             (ashift:SI (reg:SI 158) (mem:QI (plus:DI (reg:DI) (const))))
>
> This then, gets later checked with recog () and since the predicate for 
> operand
> 2 does not allow memory operands for shifts in aarch64.md, does not get 
> combined
> and all is well.
>
> After the discussions Richard Earnshaw had on IRC with Andrew Pinski, it was
> felt that it was best to fix the standard predicate
> 'general_operand' to not allow SUBREG (MEM) with side-effects as it has known
> issues associated with it - particularly reload not being able to deal with 
> such
> cases. 'general_operand' currently outlaws cases like paradoxical SUBREG (MEM)
> on targets with insn scheduling and SUBREG (MEM) with non-zero SUBREG_BYTE. 
> This
> is another case it should not allow. Here is a patch that tightens
> general_operands to not allow SUBREG (MEM) with MEM having side-effects.
>
> I would like to hear some thoughts on this.


LGTM.

It would be good to get rid of subreg mem operands altogether at some point,
but that's a little too drastic for stage 3.  This patch looks like a strict
improvement.

Richard

Re: [PATCH][RFC] Bug handling SUBREG (MEM) - MEM having side-effects?

Reply via email to