In regular mode (with register windows), the compiler can make use of three sequences to establish the frame: 1. a single 'save' instruction taking an immediate as decrement, 2. a 'save' instruction taking an immediate followed by an 'add' instruction taking another immediate as decrement, 3. a single 'save' instruction taking a register as decrement.
The first and third sequences are atomic, whereas the second isn't. Therefore it might possible for a %fp-based store instruction to be scheduled between the two frame instructions in the second sequence; if, additionally, the offset is close to the frame size, then the store might access the register window save area, which is volatile. The net effect might be as though the store wasn't issued at all (because it is quickly overwritten). The adverse effect seems to be a very rare event in practice, as I observed it for the first time a couple of days ago and AFAIK it was never reported to us for the SPARC. But the x86 port has a counter-measure for something similar. The attached patch implements a 'frame blockage', which is a memory blockage with a dependency on the stack pointer. This is sufficient to order the stack pointer decrement instructions and the subsequent stores. Tested on SPARC/Solaris, applied on mainline, 4.6 and 4.5 branches. 2011-12-16 Eric Botcazou <ebotca...@adacore.com> * config/sparc/sparc.md (UNSPEC_FRAME_BLOCKAGE): New constant. (frame_blockage): New expander. (frame_blockage<P:mode>): New instruction. * config/sparc/sparc.c (sparc_expand_prologue): When the sequence of instructions establishing the frame isn't atomic, emit frame blockage. -- Eric Botcazou
Index: config/sparc/sparc.md =================================================================== --- config/sparc/sparc.md (revision 182398) +++ config/sparc/sparc.md (working copy) @@ -28,6 +28,7 @@ (define_constants [(UNSPEC_MOVE_PIC 0) (UNSPEC_UPDATE_RETURN 1) (UNSPEC_LOAD_PCREL_SYM 2) + (UNSPEC_FRAME_BLOCKAGE 3) (UNSPEC_MOVE_PIC_LABEL 5) (UNSPEC_SETH44 6) (UNSPEC_SETM44 7) @@ -6374,6 +6375,25 @@ (define_insn "blockage" "" "" [(set_attr "length" "0")]) + +;; Do not schedule instructions accessing memory before this point. + +(define_expand "frame_blockage" + [(set (match_dup 0) + (unspec:BLK [(match_dup 1)] UNSPEC_FRAME_BLOCKAGE))] + "" +{ + operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode)); + MEM_VOLATILE_P (operands[0]) = 1; + operands[1] = stack_pointer_rtx; +}) + +(define_insn "*frame_blockage<P:mode>" + [(set (match_operand:BLK 0 "" "") + (unspec:BLK [(match_operand:P 1 "" "")] UNSPEC_FRAME_BLOCKAGE))] + "" + "" + [(set_attr "length" "0")]) (define_expand "probe_stack" [(set (match_operand 0 "memory_operand" "") (const_int 0))] Index: config/sparc/sparc.c =================================================================== --- config/sparc/sparc.c (revision 182398) +++ config/sparc/sparc.c (working copy) @@ -4972,8 +4972,9 @@ sparc_expand_prologue (void) else if (size <= 8192) { insn = emit_insn (gen_stack_pointer_inc (GEN_INT (-4096))); - /* %sp is still the CFA register. */ RTX_FRAME_RELATED_P (insn) = 1; + + /* %sp is still the CFA register. */ insn = emit_insn (gen_stack_pointer_inc (GEN_INT (4096 - size))); } else @@ -4996,8 +4997,18 @@ sparc_expand_prologue (void) else if (size <= 8192) { emit_window_save (GEN_INT (-4096)); + /* %sp is not the CFA register anymore. */ emit_insn (gen_stack_pointer_inc (GEN_INT (4096 - size))); + + /* Make sure no %fp-based store is issued until after the frame is + established. The offset between the frame pointer and the stack + pointer is calculated relative to the value of the stack pointer + at the end of the function prologue, and moving instructions that + access the stack via the frame pointer between the instructions + that decrement the stack pointer could result in accessing the + register window save area, which is volatile. */ + emit_insn (gen_frame_blockage ()); } else {