AIUI: (1) The main point of http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html was that we had:
emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx); /* This might change the hard frame pointer in ways that aren't apparent to early optimization passes, so force a clobber. */ emit_clobber (hard_frame_pointer_rtx); and, after elimination, the clobber killed the assignment to the hard frame pointer. Which it could. :-) (2) Aassuming for simplicity that STARTING_FRAME_OFFSET == 0, we end up with an assignment like: (set frame_pointer_rtx hard_frame_pointer_rtx) And the problem is that frame_pointer_rtx often isn't a real register: it gets eliminated to hard_frame_pointer_rtx+X, where X isn't known until RA time. I.e. the assignment is really: (set (plus hard_frame_pointer_rtx X) hard_frame_pointer_rtx) which becomes: (set hard_frame_pointer_rtx (plus hard_frame_pointer_rtx -X)) Before RA it isn't obvious that the assignment clobbers hard_frame_pointer_rtx, so it would seem to most passes that frame_pointer_rtx and hard_frame_pointer_rtx are equivalent after the set. That was why the clobber was there. (3) We chose to fix this by deleting the explicit clobber and relying on the magicness of unspec_volatile to imply the clobber instead. But I don't think (3) is a good idea. We should instead fix the dataflow so that it's accurate. Long term it would be good to have a different representation for (2), but I don't have any bright ideas what that might be. Until then I think we can model the dataflow accurately by having a use as well as a clobber of hard_frame_pointer_rtx. I went back to r192682: http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg02350.html and saw the -m32 tests fail without the builtins.c part of the patch. They passed after it. I then went to r200643: 2013-07-03 Hans-Peter Nilsson <h...@bitrange.com> PR middle-end/55030 * stmt.c (expand_nl_goto_receiver): Remove almost-copy of expand_builtin_setjmp_receiver. (expand_label): Adjust, call expand_builtin_setjmp_receiver with NULL for the label parameter. * builtins.c (expand_builtin_setjmp_receiver): Don't clobber the frame-pointer. Adjust comments. [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver only if LABEL is non-NULL. and applied the full patch. The tests still passed, despite the removal of the volatile checks. (To recap, the volatile checks touched here were the same ones touched by: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00868.html Rather than reverting that part I'm removing the check entirely, since we seem to agree that the original asm handling wasn't necessary.) I'll run a full test overnight, but does this look like it might be a way out, at least for 4.9? Thanks, Richard gcc/ * builtins.c (expand_builtin_setjmp_receiver): Use and clobber hard_frame_pointer_rtx. * cse.c (cse_insn): Remove volatile check. * cselib.c (cselib_process_insn): Likewise. * dse.c (scan_insn): Likewise. Index: gcc/builtins.c =================================================================== --- gcc/builtins.c 2014-03-03 21:47:59.749026019 +0000 +++ gcc/builtins.c 2014-03-03 21:48:00.550030853 +0000 @@ -910,18 +910,27 @@ expand_builtin_setjmp_receiver (rtx rece #ifdef HAVE_nonlocal_goto if (! HAVE_nonlocal_goto) #endif - /* First adjust our frame pointer to its actual value. It was - previously set to the start of the virtual area corresponding to - the stacked variables when we branched here and now needs to be - adjusted to the actual hardware fp value. - - Assignments to virtual registers are converted by - instantiate_virtual_regs into the corresponding assignment - to the underlying register (fp in this case) that makes - the original assignment true. - So the following insn will actually be decrementing fp by - STARTING_FRAME_OFFSET. */ - emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx); + { + /* First adjust our frame pointer to its actual value. It was + previously set to the start of the virtual area corresponding to + the stacked variables when we branched here and now needs to be + adjusted to the actual hardware fp value. + + Assignments to virtual registers are converted by + instantiate_virtual_regs into the corresponding assignment + to the underlying register (fp in this case) that makes + the original assignment true. + So the following insn will actually be decrementing fp by + STARTING_FRAME_OFFSET. */ + emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx); + + /* Restoring the frame pointer also modifies the hard frame pointer. + Mark it used (so that the previous assignment remains live once + the frame pointer is eliminated) and clobbered (to represent the + implicit update from the assignment). */ + emit_use (hard_frame_pointer_rtx); + emit_clobber (hard_frame_pointer_rtx); + } #if !HARD_FRAME_POINTER_IS_ARG_POINTER if (fixed_regs[ARG_POINTER_REGNUM]) @@ -965,8 +974,7 @@ expand_builtin_setjmp_receiver (rtx rece /* We must not allow the code we just generated to be reordered by scheduling. Specifically, the update of the frame pointer must - happen immediately, not later. Similarly, we must block - (frame-related) register values to be used across this code. */ + happen immediately, not later. */ emit_insn (gen_blockage ()); } Index: gcc/cse.c =================================================================== --- gcc/cse.c 2014-03-03 21:47:59.869026741 +0000 +++ gcc/cse.c 2014-03-03 21:48:00.625031305 +0000 @@ -5664,11 +5664,6 @@ cse_insn (rtx insn) invalidate (XEXP (dest, 0), GET_MODE (dest)); } - /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */ - if (NONJUMP_INSN_P (insn) - && volatile_insn_p (PATTERN (insn))) - flush_hash_table (); - /* Don't cse over a call to setjmp; on some machines (eg VAX) the regs restored by the longjmp come from a later time than the setjmp. */ Index: gcc/cselib.c =================================================================== --- gcc/cselib.c 2014-03-03 21:47:59.870026748 +0000 +++ gcc/cselib.c 2014-03-03 21:48:00.626031311 +0000 @@ -2629,9 +2629,7 @@ cselib_process_insn (rtx insn) /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp. */ if ((LABEL_P (insn) || (CALL_P (insn) - && find_reg_note (insn, REG_SETJMP, NULL)) - || (NONJUMP_INSN_P (insn) - && volatile_insn_p (PATTERN (insn)))) + && find_reg_note (insn, REG_SETJMP, NULL))) && !cselib_preserve_constants) { cselib_reset_table (next_uid); Index: gcc/dse.c =================================================================== --- gcc/dse.c 2014-03-03 21:47:59.871026754 +0000 +++ gcc/dse.c 2014-03-03 21:48:00.627031317 +0000 @@ -2470,16 +2470,6 @@ scan_insn (bb_info_t bb_info, rtx insn) return; } - /* Cselib clears the table for this case, so we have to essentially - do the same. */ - if (NONJUMP_INSN_P (insn) - && volatile_insn_p (PATTERN (insn))) - { - add_wild_read (bb_info); - insn_info->cannot_delete = true; - return; - } - /* Look at all of the uses in the insn. */ note_uses (&PATTERN (insn), check_mem_read_use, bb_info);