Re: [PATCH V3] x86: Enable separate shrink wrapping

Segher Boessenkool Fri, 04 Jul 2025 06:21:14 -0700

Hi!

On Fri, Jul 04, 2025 at 07:23:23AM +0000, Cui, Lili wrote:
> > > Initially, I looked at other architectures and disabled the hard frame
> > > pointer,
> > 
> > Like aarch?  Yeah I always wondered why they don't do it.  I decided
> > that that is because of their ABI and architecture stuff: they can save
> > and restore their frame reg (r29) with the same insn as they use for the
> > link reg (r30).  Of course they could add code to make tradeoffs there,
> > but apparently they did not see the use for that, or perhaps from
> > experience knew what way this would fall in the end.
> > 
> 
> Loongarch/rs6000/riscv/aarch64 all disable HARD_FRAME_POINTER_REGNUM.


rs6000 does not *have* a hard frame pointer!

Generic parts of GCC require a frame pointer to exist, so when people
require -fno-omit-frame-pointer we dedicate some GPR to it.  Gotta do
something, eh!  It costs about 2% performance (on average, not worst
case!)

The other archs you mention copied their code from aarch.

> > > but after reconsidering, I realized your point makes sense. If the
> > > hard frame pointer were enabled, we would typically emit push %rbp
> > > and mov %rsp, %rbp at the start of the prologue, so there is no room
> > > for separate shrink wrap, but if the function itself also uses rbp,
> > > there might be room for optimization,
> > 
> > Yup, when using a frame pointer (hard or otherwise, and a very bad plan
> > nowadays, a 1970's thing) you typically get the frame pointer
> > established very first thing, anything that touches the frame needs it
> > after all!
> > 
> > But not all code accesses the frame, many early-out paths do not for
> > example.
> > 
> 
> Yes, currently we do shrink-wrap for the entire prologue (including the
> HARD_FRAME_POINTER), which can solve some early return issues.  But we
> can't do separate-shrink-wrap for HARD_FRAME_POINTER, because
> HARD_FRAME_POINTER needs to record rsp before rsp points to the bottom of
> the stack.  We have to put it at the beginning of the prologue, and we
> have no chance to shrink it individually.

I'm not sure what things you mean here.

In most ABIs (yours as well I think?) the frame of a function is pointed
to by the stack pointer at function entry, in normal functions.  "Happy
functions" :-)

You can set a pseudo to the stack pointer at function entry and then
(or not) copy that to the frame pointer later, or let things be
optimised away.
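
To make the early-out case from this thread concrete, here is a small C
sketch.  It is not taken from the patch: the function name and body are
made up for illustration.  The early return touches neither the frame
nor any callee-saved register, so in principle the prologue work is only
needed on the slow path.  Whether GCC actually sinks it there depends on
the target, the compiler version, and options such as -fshrink-wrap and
-fshrink-wrap-separate.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example, not from the patch.  The early return needs no
   stack frame; only the slow path below uses a local buffer and makes
   calls, so only that path needs the prologue to have run.  */
size_t
count_nonspace (const char *s)
{
  if (s == NULL || *s == '\0')
    return 0;			/* early-out: no frame access */

  char buf[64];			/* slow path: lives in the frame */
  size_t len = strlen (s);
  if (len >= sizeof buf)
    len = sizeof buf - 1;	/* clamp so the copy fits in buf */
  memcpy (buf, s, len);
  buf[len] = '\0';

  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] != ' ')
      n++;
  return n;
}
```

Compiling this with -O2 -S, once with -fno-omit-frame-pointer and once
with -fomit-frame-pointer, and comparing where the push/mov of the frame
register ends up relative to the early return is one way to see the
effect being discussed.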

> I removed these two lines of code and conducted a comparison test, and
> found that the binary was unchanged.  Unfortunately, I didn't identify any
> opportunities for optimization, so I think it's better to keep them.  Not
> sure if there might be any corner case issues.

For most archs and ABIs it is very beneficial to use
-fomit-frame-pointer, I thought that was true for x86 even?  There is
a special reg for it, sure, but you can use that reg as a general reg as
well, and that is way useful on an arch with so few registers :-)
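
A rough way to look at this on a given compiler is a loop that keeps
many values live at once.  The function below is a made-up illustration
(the name mix8 and the arithmetic are assumptions, not anything from the
patch).  It only provides something to compile both ways; whether the
frame register ends up used as a general register depends on the
compiler version and options.

```c
#include <stdint.h>

/* Hypothetical register-pressure example.  Eight accumulators, the
   pointer, the count, the index and the loaded value are all live
   inside the loop, so the register allocator has use for every
   general register it is allowed to touch.  */
uint64_t
mix8 (const uint64_t *p, int n)
{
  uint64_t a = 1, b = 2, c = 3, d = 4, e = 5, f = 6, g = 7, h = 8;
  for (int i = 0; i < n; i++)
    {
      uint64_t x = p[i];
      a += x; b ^= a; c += b; d ^= c;
      e += d; f ^= e; g += f; h ^= g;
    }
  return a + b + c + d + e + f + g + h;
}
```

Building it with -O2 -S under -fomit-frame-pointer and then under
-fno-omit-frame-pointer and diffing the output shows how the register
choices differ between the two builds.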


Segher
