http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46685
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> 2010-11-29 15:17:30 UTC ---
I guess we could do something like:

--- varasm.c.jj 2010-11-29 12:39:07.000000000 +0100
+++ varasm.c 2010-11-29 15:15:53.000000000 +0100
@@ -534,6 +534,15 @@ section *
 default_function_section (tree decl, enum node_frequency freq,
                           bool startup, bool exit)
 {
+  /* Force nested functions into the same section as the containing
+     function.  */
+  if (decl
+      && DECL_SECTION_NAME (decl) == NULL_TREE
+      && DECL_CONTEXT (decl) != NULL_TREE
+      && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL
+      && DECL_SECTION_NAME (DECL_CONTEXT (decl)) == NULL_TREE)
+    return function_section (DECL_CONTEXT (decl));
+
   /* Startup code should go to startup subsection unless it is
      unlikely executed (this happens especially with function splitting
      where we can split away unnecesary parts of static constructors.  */

or perhaps, instead of changing default_function_section, write a wrapper
around it and use it on targets that need it. This will use the same section
as the containing function for nested functions.

On the other hand, the above is still broken with
-freorder-blocks-and-partition, and I guess it was broken even before Honza's
change.

Alternatively, we could do something like:

--- gcc/config/sparc/sparc.c.jj 2010-11-26 18:39:04.000000000 +0100
+++ gcc/config/sparc/sparc.c 2010-11-29 15:35:00.727219374 +0100
@@ -1066,8 +1066,13 @@ sparc_expand_move (enum machine_mode mod
          are absolutely sure that X is in the same segment as the GOT.
          Unfortunately, the flexibility of linker scripts means that we
          can't be sure of that in general, so assume that _G_O_T_-relative
-         accesses are never valid on VxWorks.  */
-      if (GET_CODE (operands[1]) == LABEL_REF && !TARGET_VXWORKS_RTP)
+         accesses are never valid on VxWorks.
+         If the label is non-local, it might be placed in a different section
+         from . and movMODE_pic_label_ref patterns require the label and .
+         to be in the same section.  */
+      if (GET_CODE (operands[1]) == LABEL_REF
+          && !TARGET_VXWORKS_RTP
+          && !LABEL_REF_NONLOCAL_P (operands[1]))
        {
          if (mode == SImode)
          {

I am not sure whether, even within the current function, sparc_expand_move
could be asked for a label that is in the other partition. I mean something
like:

__attribute__((noinline, noclone)) void
bar (void *x)
{
  asm volatile ("" : : "r" (x) : "memory");
}

__attribute__((noinline, noclone)) void
baz (void)
{
  asm volatile ("" : : : "memory");
}

__attribute__((noinline, noclone)) int
foo (int x)
{
  __label__ lab;
  if (__builtin_expect (x, 0))
    {
      lab:
      baz ();
      return 2;
    }
  bar (&&lab);
  return 1;
}

int
main (void)
{
  int x, i;
  asm volatile ("" : "=r" (x) : "0" (0));
  for (i = 0; i < 1000000; i++)
    foo (x);
  return 0;
}

first compiled/linked with
-O2 -fprofile-generate -freorder-blocks-and-partition -fpic, then executed,
then compiled again with
-O2 -fprofile-use -freorder-blocks-and-partition -fpic.
At least on x86_64-linux the baz () bb is in .text.unlikely, while
bar (&&lab) is in the .text section. Now, I guess this wouldn't assemble on
sparc-linux or Solaris, even before Honza's patch.

In that case we perhaps could look at LABEL_REF's operand (if any, nonlocal
labels probably don't have them) and look at bb flags of its INSN_BLOCK.
Unfortunately, the partitions are only computed during *.bbpart, so this is
not known at expansion time (and in sparc_expand_move we might not even know
in which bb we are going to end up). So, perhaps we'd just need some splitter
for these insns and change them to something else if the current bb and the
target LABEL_REF have different partitions. Or simply amend the second patch
above and disable this optimization even for
flag_reorder_blocks_and_partition.
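
For readers unfamiliar with the varasm.c hook, the rule in the first patch
(a nested function without an explicit section inherits the section of its
containing function) can be sketched as a toy model in plain C. This is only
an illustration under stated assumptions: struct toy_decl and the toy_*
functions are hypothetical stand-ins, not GCC's tree API, and the "default"
placement is reduced to a single hot/unlikely split.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a FUNCTION_DECL; not GCC's real tree type.  */
struct toy_decl
{
  const char *section_name;       /* explicit section, or NULL */
  const struct toy_decl *context; /* containing function, or NULL */
  int unlikely;                   /* nonzero if rarely executed */
};

/* Toy default placement: split only on the "unlikely" flag.  */
static const char *
toy_default_section (const struct toy_decl *decl)
{
  return decl->unlikely ? ".text.unlikely" : ".text";
}

/* Toy section choice.  An explicit section always wins (an assumption of
   this model).  Otherwise a nested function whose container also has no
   explicit section is forced into the container's section, mirroring the
   condition in the proposed patch.  */
static const char *
toy_function_section (const struct toy_decl *decl)
{
  if (decl->section_name != NULL)
    return decl->section_name;
  if (decl->context != NULL && decl->context->section_name == NULL)
    return toy_function_section (decl->context);
  return toy_default_section (decl);
}

/* A cold nested function follows its hot container into .text instead of
   being placed in .text.unlikely on its own.  */
void
toy_self_check (void)
{
  struct toy_decl outer = { NULL, NULL, 0 };
  struct toy_decl inner = { NULL, &outer, 1 };

  assert (strcmp (toy_function_section (&outer), ".text") == 0);
  assert (strcmp (toy_function_section (&inner), ".text") == 0);
}
```

Note that the model says nothing about -freorder-blocks-and-partition: it
places whole functions, whereas that option splits a single function's basic
blocks across sections, which is why the first patch alone does not cover
that case.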