https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79748
--- Comment #2 from Katsunori Kumatani <katsunori.kumatani at gmail dot com> ---

I tried -O3 -fipa-ra on the following example code, but it seems it doesn't do what I suggested. (I used inline asm to force it to use a callee-saved register, no other reason... just to demonstrate.)

    #include <stdio.h>

    static __attribute__((noinline)) int foo(int a)
    {
      asm("incl %0":"+b"(a));  // use ebx just to demonstrate
      return a;
    }

    void bar(int x)
    {
      asm("incl %0":"+b"(x));  // in caller as well
      printf("%d", foo(x));
    }

And I get this (see comments):

    foo(int):
            pushq   %rbx            # saves rbx needlessly
            movl    %edi, %ebx
            incl    %ebx
            movl    %ebx, %eax
            popq    %rbx
            ret
    bar(int):
            pushq   %rbx            # because this already saves it and
            movl    %edi, %ebx
            incl    %ebx
            movl    %ebx, %edi
            call    foo(int)
            popq    %rbx            # restores it here... (ABI)
            movl    %eax, %esi
            movl    $.LC0, %edi
            xorl    %eax, %eax
            jmp     printf

Since GCC knows that 'foo' is "internal" to the code (not externally visible, its address is not taken, and with LTO it can know even across translation units), it could optimize this without having to save/restore rbx in 'foo'. 'bar' does the right thing by saving 'rbx', but 'foo' saves and restores it again for no benefit... therefore, this attribute would be applied to 'foo'.

Now I know that such an optimization might not be easy to add. That's why I did not ask for an optimization, but for a way to use the *existing* interprocedural optimizations of GCC. The attribute would help with that, because then GCC won't save/restore rbx in 'foo' (due to the attribute).

Note that GCC's ipa-ra does work well, but it only applies to registers that are "unsaved" (caller-saved). For example, if you use 'ecx' in 'bar', it will *not* spill it across the call to 'foo', because it knows 'foo' does not modify / clobber 'rcx' at all. That's basically what I'd like for 'rbx' and the other callee-saved registers. This attribute would simply give GCC more freedom in these situations.

This is probably more useful for x86-64 because it has more callee-saved registers, and pushing/popping them every time in a function has more implications than with the 4 used in 32-bit (but of course I'm sure it can be added for i386 too, if it gets added I mean)... and this happens even if the caller doesn't necessarily require it.

I suggested it because I figured it would be an easy and useful addition (it may be useful for more things than just this particular situation). It doesn't have the problem that no_caller_saved_registers (which is already added) suffers from, of other "non general purpose" registers not being saved, because those don't get saved anyway.
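
To make the 'ecx' case mentioned above easier to try, here is a minimal sketch of it. The names `callee`, `caller` and `INC_IN_ECX` are made up for this illustration and are not from the report; the non-x86 fallback exists only so the file compiles elsewhere. Whether the value really stays in ecx across the call depends on the GCC version and flags, so treat it as something to inspect (for instance with gcc -O2 -fipa-ra -S), not as a guarantee.

```c
/* Hypothetical variant of the example above, using a caller-saved register. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Force the operand into ecx, which the ABI treats as call-clobbered. */
#define INC_IN_ECX(v) __asm__("incl %0" : "+c"(v))
#else
/* Assumption: plain increment as a stand-in on other targets. */
#define INC_IN_ECX(v) ((void)++(v))
#endif

/* Internal function that has no reason to touch ecx. */
static __attribute__((noinline)) int callee(int a)
{
    return a + 1;
}

int caller(int x)
{
    int keep = x;
    INC_IN_ECX(keep);       /* 'keep' is placed in ecx here */
    int r = callee(x);      /* with -fipa-ra, GCC may leave 'keep' in ecx */
    return r + keep;        /* 'keep' is still live after the call */
}
```

If ipa-ra applies, the assembly for `caller` should show no save or reload of ecx around the call to `callee`. The request in this report is for a way to get the equivalent behaviour for rbx and the other callee-saved registers.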