On Nov 8, 2011, at 6:13 AM, Alan Modra wrote:
> It's been a while since I looked at what was happening with this
> testcase, but from memory what stops scheduling over the stack_tie is
> alias.c:base_alias_check.  Which relies on tracking registers to see
> whether two memory accesses using different regs could alias.  Quite
> possibly gcc-4.4 is deficient in this area.

I re-investigated our testcase yesterday and this is indeed what makes the
difference.

I couldn't reproduce the misbehavior on mainline, not because of the recent
change you mentioned upthread, but because of an earlier one, rev 180522.

The problem is visible on 180521, and here is what is happening (roughly):

The testcase (sources below), compiled with -O1 -fschedule-insns2 by a
powerpc-wrs-vxworks compiler, produces in .pro_and_epilogue:

  (insn 29 22 28 2 (set (reg/f:SI 11 11 [133])
          (high:SI (symbol_ref:SI ("mysym11") ...))) ./t.c:19 408 {elf_high}
       (expr_list:REG_EQUIV (high:SI (symbol_ref:SI ("mysym11") ...)

  (insn 28 29 24 2 (set (reg/f:SI 0 0 [orig:123 p11$3 ] [123])
          (lo_sum:SI (reg/f:SI 11 11 [133])
              (symbol_ref:SI ("mysym11") )
       (expr_list:REG_EQUIV (symbol_ref:SI ("mysym11") ...)
  ...
  (note 39 24 40 2 NOTE_INSN_EPILOGUE_BEG)

  (insn 40 39 41 2 (set (reg:SI 11 11)
          (plus:SI (reg/f:SI 31 31)
              (const_int 48 [0x30]))) ./t.c:22 -1
  ...
  (insn 43 42 44 2 (set (reg:SI 25 25)
          (mem/c:SI (plus:SI (reg:SI 11 11)
                  (const_int -28 [0xffffffffffffffe4])) [0 S4 A8])) ./t.c:22 -1

  (insn/f 44 43 45 2 (set (reg/f:SI 31 31)
          (mem/c:SI (plus:SI (reg:SI 11 11)
                  (const_int -4 [0xfffffffffffffffc])) [0 S4 A8])) ./t.c:22 -1

  (insn 45 44 46 2 (set (mem/c:BLK (reg/f:SI 1 1) [0 A8])
          (unspec:BLK [
                  (mem/c:BLK (reg/f:SI 1 1) [0 A8])
              ] UNSPEC_TIE)) ./t.c:22 -1

  (insn/f 46 45 47 2 (set (reg/f:SI 1 1)
          (reg:SI 11 11)) ./t.c:22 -1

insn 45 is the "stack tie" intended to prevent a move of the sp restore
(insn 46) prior to the register restores accessing the frame (insn 43 here).

Most of the time, as you say, this works because the mem accesses are
considered conflicting.

In this very particular case, we get into base_alias_check with

  x = (reg/f:SI 1 1)
  y = (plus:SI (reg:SI 11 11) (const_int -4 [0xfffffffffffffffc]))

from which base_alias_check infers

  x_base = (address:SI (reg/f:SI 1 1))
  y_base = (symbol_ref:SI ("mysym11") ...)

(Erm, _that_ doesn't sound right: r11 certainly is not a base access to
 mysym11 past insn 40)

x_base and y_base are both != 0, and we get (still within base_alias_check)
into

  /* If one address is a stack reference there can be no alias:
     stack references using different base registers do not alias,
     a stack reference can not alias a parameter, and a stack reference
     can not alias a global.  */

Eventually, the tie and the sp restore move up together prior to the
register restore.

rev 180522 changes the register allocation so that r11 is not used in the
function body any more.

That r11 is considered a base to mysym11 certainly looks incorrect.

Now, looking at the comment quoted above suggests that the current stack
tie mechanism used in emit_epilogue (relying on mem r11 to always conflict
with mem:blk sp) is optimistic, regardless, which Joseph's experience seems
to confirm.

Joseph resorted to mem:scratch to impose a strong barrier. That's certainly
safe and I don't think the performance impact can be significant, so this
looks like a good way out.

I tried an approach in between here: replace the stack_tie insn by a
frame_tie insn with frame_base_rtx as the second base. In the case above,
with frame_base_rtx==r11, this is something like

  (set (mem/c:BLK (reg:SI 11 11) [0 A8])
       (unspec:BLK [
               (mem/c:BLK (reg:SI 11 11) [0 A8])
               (mem/c:BLK (reg/f:SI 1 1) [0 A8])

As I mentioned upthread, I'm still unclear on a couple of aspects though.
For example, ISTM that there are paths where we might need a tie and
different frame base regs were used before. The conditions controlling the
different paths are intricate so I'm not quite sure.

Joseph's approach would for sure void the need to seek answers :)

Olivier

--

char mysym11;
char * volatile g11;

void foo (long x)
{
  /* Force frame_pointer_needed & use of r11 as a frame_reg_rtx
     from emit_epilogue.  */
  char volatile s [x];
  s[0] = 12;

  /* Force a register save/restore.  */
  asm volatile ("" : : : "25");

  /* Try to force something like (set r11 (symref:mysym)) in the
     function body.  */
  {
    register char * volatile p11 asm("11");
    p11 = &mysym11;
    g11 = p11;
  }
}
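
--

For reference, the base_alias_check outcome described above can be sketched with a toy C model. This is only an illustration of the reasoning, not GCC's code: the toy_* names and types are hypothetical, and the model assumes that an unknown base is treated conservatively while two distinct known bases (one of them a stack address) are declared non-aliasing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the base terms discussed above.
   None of these names exist in GCC.  */
enum toy_base_kind
{
  TOY_BASE_UNKNOWN,        /* no base could be determined */
  TOY_BASE_STACK_ADDRESS,  /* like (address:SI (reg/f:SI 1 1)) */
  TOY_BASE_SYMBOL          /* like (symbol_ref:SI ("mysym11")) */
};

struct toy_base
{
  enum toy_base_kind kind;
  const char *name;        /* register or symbol name, for comparison only */
};

/* Return true if the two accesses must be assumed to possibly alias.
   Assumptions of this model (not taken from alias.c): an unknown base
   is handled conservatively, and identical bases may alias.  */
bool
toy_base_may_alias (const struct toy_base *x, const struct toy_base *y)
{
  /* No base information on either side: stay conservative.  */
  if (x->kind == TOY_BASE_UNKNOWN || y->kind == TOY_BASE_UNKNOWN)
    return true;

  /* Same kind and same name: the accesses share a base.  */
  if (x->kind == y->kind && strcmp (x->name, y->name) == 0)
    return true;

  /* One side is a stack reference with a different base: declared
     non-aliasing, as in the comment quoted in the message.  */
  if (x->kind == TOY_BASE_STACK_ADDRESS || y->kind == TOY_BASE_STACK_ADDRESS)
    return false;

  /* Two distinct symbols.  */
  return false;
}
```

With x as the stack address on r1 and y wrongly tagged as the symbol mysym11, the model answers "no alias", which is the situation that lets the tie and the sp restore move up. Had the base of r11 been unknown past insn 40, the answer would be "may alias" and the ordering would hold.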