> -----Original Message-----
> From: Segher Boessenkool <seg...@kernel.crashing.org>
> Sent: Friday, July 4, 2025 9:21 PM
> To: Cui, Lili <lili....@intel.com>
> Cc: ubiz...@gmail.com; gcc-patches@gcc.gnu.org; Liu, Hongtao
> <hongtao....@intel.com>; richard.guent...@gmail.com; Michael Matz
> <m...@suse.de>
> Subject: Re: [PATCH V3] x86: Enable separate shrink wrapping
>
> Hi!
>
> On Fri, Jul 04, 2025 at 07:23:23AM +0000, Cui, Lili wrote:
> > > > Initially, I looked at other architectures and disabled the hard
> > > > frame pointer,
> > >
> > > Like aarch? Yeah I always wondered why they don't do it. I decided
> > > that that is because of their ABI and architecture stuff they can
> > > save and restore their frame reg (r29) with the same insn as they
> > > use for the link reg (r30). Of course they could do code to do
> > > tradeoffs there, but apparently they did not see the use for that, or
> > > perhaps from experience knew what way this would fall in the end.
> > >
> >
> > Loongarch/rs6000/riscv/aarch64 all disable HARD_FRAME_POINTER_REGNUM.
>
> rs6000 does not *have* a hard frame pointer!
>
Oh, I see. The handling of HARD_FRAME_POINTER_REGNUM seems redundant for
rs6000:

rs6000_get_separate_components (void)
{
  ...
  /* Don't mess with the hard frame pointer.  */
  if (frame_pointer_needed)
    bitmap_clear_bit (components, HARD_FRAME_POINTER_REGNUM);
  ...

> Generic parts of GCC require a frame pointer to exist, so when people require
> -fno-omit-frame-pointer we dedicate some GPR to it. Gotta do something,
> eh! It costs about 2% performance (on average, not worst case!)
>
> The other archs you mention copied their code from aarch.
>
> > > > but after reconsidering, I realized your point makes sense. If the
> > > > hard frame pointer were enabled, we would typically emit push
> > > > %rbp and mov %rsp, %rbp at the start of the prologue, so there is no
> > > > room for separate shrink wrap, but if the function itself also uses
> > > > rbp, there might be room for optimization,
> > >
> > > Yup, when using a frame pointer (hard or otherwise, and a very bad
> > > plan nowadays, a 1970's thing) you typically get the frame pointer
> > > established very first thing, anything that touches the frame needs it
> > > after all!
> > >
> > > But not all code accesses the frame, many early-out paths do not for
> > > example.
> > >
> >
> > Yes, currently we do shrink-wrap for the entire prologue (including the
> > HARD_FRAME_POINTER), which can solve some early-return issues. But we can't
> > do separate-shrink-wrap for HARD_FRAME_POINTER, because
> > HARD_FRAME_POINTER needs to record rsp before rsp points to the bottom
> > of the stack. We have to put it at the beginning of the prologue, and we
> > have no chance to shrink-wrap it individually.
>
> I'm not sure what things you mean here.
>
> In most ABIs (yours as well I think?) the frame of a function is pointed to by
> the stack pointer at function entry, in normal functions. "Happy functions"
> :-)
>
Yes, in a normal function it would be placed in the entry bb, but shrink-wrap
might move the whole prologue to after the early return. x86 emits the frame
pointer setup together with the rest of the prologue. For some early-return
situations, shrink-wrap can avoid executing the frame pointer setup at all.
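
As a rough illustration of that early-return case (my own sketch, not code
from the patch; the function name and the local buffer are made up for the
example):

```c
#include <stddef.h>

/* Hypothetical example, not taken from the patch.  The early-out path
   returns without touching the frame, so a shrink-wrapped prologue
   (including push %rbp / mov %rsp, %rbp when a frame pointer is in use)
   only has to run on the path that falls through.  */
long
sum_squares (const int *v, size_t n)
{
  if (v == NULL || n == 0)
    return 0;		/* Early return: nothing here needs the frame.  */

  long buf[16];		/* Local array: this path may need the frame.  */
  long total = 0;
  for (size_t i = 0; i < n; i++)
    {
      buf[i % 16] = (long) v[i] * v[i];
      total += buf[i % 16];
    }
  return total;
}
```

Whether the compiler really keeps buf on the stack depends on the
optimization level, so this only shows the shape of the control flow, not
the code that GCC will generate for it.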
> You can set a pseudo to the stack pointer at function entry and then (either
> or not) copy that to the frame pointer later, or let things be optimised away.
>
> > I removed these two lines of code and conducted a comparison test, and
> > found that the binary was unchanged. Unfortunately, I didn't identify any
> > opportunities for optimization, so I think it's better to keep them. Not
> > sure if there might be any corner-case issues.
>
> For most archs and ABIs it is very beneficial to use -fomit-frame-pointer, I
> thought that was true for x86 even? There is a special reg for it, sure, but
> you can use that reg as a general reg as well, and that is way useful on an
> arch with so few registers :-)
Yes, -fomit-frame-pointer does help performance. Here is a small example:
https://godbolt.org/z/5Tc3jM7qc . Do you mean optimizing the %rbp here?
Thanks,
Lili.
>
> Segher