http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59393

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |mips16
   Target Milestone|---                         |4.8.3

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
  <bb 2>:
  s_4 = &key_3(D)->S[0];
...
  _15 = _14 * 8;
  _16 = s_4 + _15;
  _17 = *_16;
...
  _21 = _20 * 8;
  _22 = s_4 + _21;
  _23 = *_22;
...

Formerly we would have created

  _17 = MEM[key_3].S[_14];
...
  _23 = MEM[key_3].S[_20];

which isn't a valid transform.  Possibly that is what got us better
addressing-mode selection?  At RTL this probably (I didn't verify)
re-associates the key_3 + offsetof(S) + index * 8 expression into a more
suitable form and bypasses the multiple-use restriction of combine
(forwprop here un-CSEs key_3 + offsetof(S)).
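
To make the two address shapes concrete, here is a minimal C sketch.  The
struct layout is an assumption for illustration only (the testcase is not
quoted in this comment): 18 eight-byte words are placed in front of S so
that offsetof(S) comes out as 144, the constant seen in the x86_64 dump
further down.  The function names are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen so that offsetof(struct key, S) == 144. */
struct key {
    uint64_t P[18];
    uint64_t S[1024];
};

static_assert(offsetof(struct key, S) == 144, "S is expected at offset 144");

/* Shape of the GIMPLE above: s is computed once and then indexed. */
uint64_t load_via_s(const struct key *key, size_t idx)
{
    const uint64_t *s = &key->S[0];   /* s_4 = &key_3(D)->S[0]           */
    return *(s + idx);                /* _16 = s_4 + idx * 8; _17 = *_16 */
}

/* Re-associated shape: key + (offsetof(S) + idx * 8).  Written this way,
   the constant part can become the displacement of the memory operand. */
uint64_t load_reassociated(const struct key *key, size_t idx)
{
    const unsigned char *base = (const unsigned char *)key;
    const unsigned char *addr =
        base + offsetof(struct key, S) + idx * sizeof(uint64_t);
    return *(const uint64_t *)addr;   /* same object, same alignment */
}
```

Both functions read the same element; they differ only in how the address
is associated, which is the part the RTL passes would need to get right.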

In a loop, IVOPTs would be the pass that uses target addressing-mode
information and possibly generates a TARGET_MEM_REF.  Outside of loops
we have SLSR (gimple-ssa-strength-reduction.c), which could serve as a
vehicle to generate TARGET_MEM_REFs (it doesn't).

In the end I would point at RTL forwprop, which is supposed to improve
addressing-mode selection.  At least on x86_64 I see

        leaq    144(%rsi), %rax
...
        xorq    4096(%rax,%rbx,8), %r8
        addl    6144(%rax,%r9,8), %r8d

there too (with %rsi still live), instead of the 144 being folded into
the dereference offset.
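
As a rough check of what folding would mean for the two memory operands
quoted above, the sketch below adds the leaq constant to each displacement.
The helper names are hypothetical; the only facts used are the constants
from the dump and that x86_64 memory operands take a signed 32-bit
displacement.

```c
#include <assert.h>
#include <stdint.h>

enum { S_OFFSET = 144 };   /* constant added by the leaq in the dump */

/* Displacement the memory operand would carry if the leaq constant were
   folded into it and %rsi were used as the base register directly. */
int64_t folded_disp(int64_t disp)
{
    return disp + S_OFFSET;
}

/* x86_64 addressing modes accept a signed 32-bit displacement. */
int fits_disp32(int64_t disp)
{
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

static_assert(6144 + S_OFFSET <= INT32_MAX, "folded displacement fits");
```

With these constants the folded operands would be 4240(%rsi,%rbx,8) and
6288(%rsi,%r9,8), both well within range, so the separate leaq would not
be needed for these two accesses.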

forwprop sees

(insn 8 5 9 2 (parallel [
            (set (reg/v/f:DI 85 [ s ])
                (plus:DI (reg/v/f:DI 991 [ key ])
                    (const_int 144 [0x90])))
            (clobber (reg:CC 17 flags))
...
(insn 20 19 21 3 (set (reg:DI 998 [ *_22 ])
        (mem:DI (plus:DI (mult:DI (reg:DI 995)
                    (const_int 8 [0x8]))
                (reg/v/f:DI 85 [ s ])) [2 *_22+0 S8 A64]))
...

and then combine folds in an additional addition:

Trying 18 -> 20:
Successfully matched this instruction:
(set (reg:DI 998 [ *_22 ])
    (mem:DI (plus:DI (plus:DI (mult:DI (reg:DI 994 [ D.1883 ])
                    (const_int 8 [0x8]))
                (reg/v/f:DI 85 [ s ]))
            (const_int 2048 [0x800])) [2 *_22+0 S8 A64]))

but of course it doesn't consider insn 8 (it is in another basic block and
its result has multiple uses).

There is no further forwprop pass after combine (one that might then fold
in the addition, though I'm not sure it would).  And ira/lra/reload
certainly do not consider materializing the def in place instead of
spilling it, either.

I'm not sure what the situation is on mips16, but in the end the RTL
optimizers are supposed to fix up anything related to addressing-mode
selection.
