On Nov 8, 2011, at 6:13 AM, Alan Modra wrote:

> It's been a while since I looked at what was happening with this
> testcase, but from memory what stops scheduling over the stack_tie is
> alias.c:base_alias_check.  Which relies on tracking registers to see
> whether two memory accesses using different regs could alias.  Quite
> possibly gcc-4.4 is deficient in this area.

 I re-investigated our testcase yesterday and this is indeed what 
 makes the difference. I couldn't reproduce the misbehavior on mainline,
 not because of the recent change you mentioned upthread, but because
 of an earlier one, rev 180522.

 The problem is visible on 180521, and here is what is happening (roughly):
 The testcase (sources below), compiled with -O1 -fschedule-insns2 by a
 powerpc-wrs-vxworks compiler, produces in .pro_and_epilogue:

   (insn 29 22 28 2 (set (reg/f:SI 11 11 [133])
          (high:SI (symbol_ref:SI ("mysym11") ...))) ./t.c:19 408 {elf_high}
     (expr_list:REG_EQUIV (high:SI (symbol_ref:SI ("mysym11") ...)

   (insn 28 29 24 2 (set (reg/f:SI 0 0 [orig:123 p11$3 ] [123])
        (lo_sum:SI (reg/f:SI 11 11 [133]) (symbol_ref:SI ("mysym11") ) 
     (expr_list:REG_EQUIV (symbol_ref:SI ("mysym11") ...)
   ...

   (note 39 24 40 2 NOTE_INSN_EPILOGUE_BEG)

   (insn 40 39 41 2 (set (reg:SI 11 11)
        (plus:SI (reg/f:SI 31 31)
            (const_int 48 [0x30]))) ./t.c:22 -1
   ...
   (insn 43 42 44 2 (set (reg:SI 25 25)
        (mem/c:SI (plus:SI (reg:SI 11 11)
                (const_int -28 [0xffffffffffffffe4])) [0 S4 A8])) ./t.c:22 -1

   (insn/f 44 43 45 2 (set (reg/f:SI 31 31)
        (mem/c:SI (plus:SI (reg:SI 11 11)
                (const_int -4 [0xfffffffffffffffc])) [0 S4 A8])) ./t.c:22 -1
 
   (insn 45 44 46 2 (set (mem/c:BLK (reg/f:SI 1 1) [0 A8])
        (unspec:BLK [
                (mem/c:BLK (reg/f:SI 1 1) [0 A8])
            ] UNSPEC_TIE)) ./t.c:22 -1
 
    (insn/f 46 45 47 2 (set (reg/f:SI 1 1)
        (reg:SI 11 11)) ./t.c:22 -1


 insn 45 is the "stack tie" intended to prevent a move of the sp restore
 (insn 46) prior to the register restores accessing the frame (insn 43 here).

 Most of the time, as you say, this works because the mem accesses are
 considered conflicting. In this very particular case, we get into
 base_alias_check with

   x = (reg/f:SI 1 1)
   y = (plus:SI (reg:SI 11 11)
         (const_int -4 [0xfffffffffffffffc]))

 from which base_alias_check infers

   x_base = (address:SI (reg/f:SI 1 1))
   y_base = (symbol_ref:SI ("mysym11") ...)

 (Erm, _that_ doesn't sound right: r11 certainly is not a base access
  to mysym11 past insn 40)

 x_base and y_base are both != 0, and we get (still within base_alias_check)
 into
 
    /* If one address is a stack reference there can be no alias:
       stack references using different base registers do not alias,
       a stack reference can not alias a parameter, and a stack reference
       can not alias a global.  */

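 With the two bases deemed distinct and one of them a stack address, the
 accesses are considered not to alias, so nothing orders the tie against
 the frame loads. Eventually, the tie and the sp restore move up together
 prior to the register restore.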
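
 To make the rule in that comment concrete, here is a toy C model of the
 decision. This is only a sketch and not GCC's code: the enum, the struct
 and toy_base_alias_check are names made up for illustration, and the real
 base_alias_check handles many more cases.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the base terms alias.c tracks.  */
enum base_kind { BASE_UNKNOWN, BASE_STACK_ADDRESS, BASE_SYMBOL };

struct base_term
{
  enum base_kind kind;
  const char *name;   /* Register or symbol name, for illustration only.  */
};

/* Return true if two accesses with these bases may alias.  */
static bool
toy_base_alias_check (struct base_term x, struct base_term y)
{
  /* An unknown base forces the conservative answer.  */
  if (x.kind == BASE_UNKNOWN || y.kind == BASE_UNKNOWN)
    return true;

  /* Identical bases may alias.  */
  if (x.kind == y.kind && strcmp (x.name, y.name) == 0)
    return true;

  /* Distinct bases, one of them a stack address: assume no alias.  */
  if (x.kind == BASE_STACK_ADDRESS || y.kind == BASE_STACK_ADDRESS)
    return false;

  /* Anything else stays conservative in this toy model.  */
  return true;
}
```

 With x = { BASE_STACK_ADDRESS, "r1" } and y = { BASE_SYMBOL, "mysym11" },
 the toy check returns false, which mirrors how the stale mysym11 base for
 r11 lets the tie be treated as independent of the frame loads.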

 rev 180522 changes the register allocation so that r11 is not used in
 the function body any more.

 That r11 is considered a base to mysym11 certainly looks incorrect. Now,
 the comment quoted above suggests that the current stack tie mechanism
 used in emit_epilogue (relying on mem r11 to always conflict with
 mem:blk sp) is optimistic regardless, which Joseph's experience seems to
 confirm.

 Joseph resorted to mem:scratch to impose a strong barrier. That's certainly
 safe and I don't think the performance impact can be significant, so this
 looks like a good way out.

 I tried an approach in between here: replace the stack_tie insn by a
 frame_tie insn with frame_base_rtx as the second base. In the case above,
 with frame_base_rtx==r11, this is something like

  (set (mem/c:BLK (reg:SI 11 11) [0 A8])
        (unspec:BLK [
                (mem/c:BLK (reg:SI 11 11) [0 A8])
                (mem/c:BLK (reg/f:SI 1 1) [0 A8])

 As I mentioned upthread, I'm still unclear on a couple of aspects though.
 For example, ISTM that there are paths where we might need a tie and different
 frame base regs were used before. The conditions controlling the different
 paths are intricate so I'm not quite sure. Joseph's approach would certainly
 avoid the need to seek answers :)

 Olivier

--

char mysym11;
char * volatile g11;

void foo (long x)
{
  /* Force frame_pointer_needed & use of r11 as a frame_reg_rtx
     from emit_epilogue.  */
  char volatile s [x];
  s[0] = 12;

  /* Force a register save/restore.  */
  asm volatile ("" : : : "25");

  /* Try to force something like (set r11 (symref:mysym))
     in the function body.  */
  {
    register char * volatile p11 asm("11");

    p11 = &mysym11;
    g11 = p11;
  }  
}

