> -----Original Message-----
> From: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>
> Sent: Wednesday, June 25, 2025 4:20 PM
> To: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>; gcc-
> patc...@gcc.gnu.org
> Subject: RE: [PATCH] expand: Allow sibcalling for return
> structures in some cases [PR71761]
> 
> > -----Original Message-----
> > From: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>
> > Sent: Monday, June 23, 2025 11:39 AM
> > To: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>; gcc-
> > patc...@gcc.gnu.org
> > Subject: RE: [PATCH] expand: Allow sibcalling for return
> structures in
> > some cases [PR71761]
> >
> > > -----Original Message-----
> > > From: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>
> > > Sent: Monday, June 23, 2025 8:01 AM
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>
> > > Subject: [PATCH] expand: Allow sibcalling for return
> > structures in
> > > some cases [PR71761]
> > >
> > > In the case of tailing call with a return of a structure,
> > currently
> > > all large structures are rejected. We can allow the case
> were
> > the
> > > return of the "tail call" function is setting the return value
> of
> > the
> > > current function.  This allows for the musttail that is
> located
> > in
> > > pr71761-1.c.
> > >
> > > This should be safe as sibcalls here is RSO is always set for
> > this
> > > case so the current function no longer owns the space.
> > >
> > > Bootstrapped and tested on x86_64-linux-gnu with no
> > regressions.
> >
> > Looks like there is some issues with aarch64 which I will
> need to look
> > into to see what is going wrong. Bootstrap fails on aarch64;
> looks
> > like some wrong code happening while compiling gcc itself.
> >
> > So right for now, this patch is withdrawn while I figure out
> what is
> > going wrong.
> 
> Just a quick update on this and writing notes up on it. The
> following C parser functions seems to be mis-compiling:
> c_parser_cast_expression
> c_parser_unary_expression
> 
> c_parser_cast_expression tail calls into
> c_parser_unary_expression and c_parser_unary_expression
> tail calls into c_parser_postfix_expression
> 
> This is for the default cases in both.
> I had looked at the resulting assembly and I could not spot
> what was going wrong at this point but still look further.

Some good news I have got a reduced testcase:
```
typedef struct token
{
    char const* tok_start;
    char const* tok_end;
    int tok_type;
    unsigned identifier_hash;
}token;

[[gnu::noinline,gnu::noipa]]
token f()
{
  return (token){};
}
[[gnu::noinline,gnu::noipa]]
void gh()
{
  // Clobber the register containing the struct address
  asm("mov x8, 0":::"x8");
}

[[gnu::noinline,gnu::noipa]]
token g(int t)
{
  if (t)
    gh();
  [[gnu::musttail]]
  return f();
}

int main()
{
  token t = g(1);
  return 0;
} ```
The problem is we don't get the correct value for x8 for the call to f.
Now I need to see where x8 gets set in the backend for the function call. Maybe 
there is a path which is wrong for tail calls.

Thanks,
Andrew 


> 
> 
> >
> > Thanks,
> > Andrew
> >
> > >
> > >   PR middle-end/71761
> > >
> > > gcc/ChangeLog:
> > >
> > >   * calls.cc (can_implement_as_sibling_call_p): Don't reject
> > >   "structured" returns if the addr is the same as the current
> > >   result decl's memory.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * c-c++-common/pr71761-1.c: New test.
> > >
> > > Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com>
> > > ---
> > >  gcc/calls.cc                           | 23 +++++++++++++++++------
> > >  gcc/testsuite/c-c++-common/pr71761-1.c | 17
> > > +++++++++++++++++
> > >  2 files changed, 34 insertions(+), 6 deletions(-)  create
> mode
> > > 100644 gcc/testsuite/c-c++-common/pr71761-1.c
> > >
> > > diff --git a/gcc/calls.cc b/gcc/calls.cc index
> > > 1dcda20a06e..d1f3e17591f 100644
> > > --- a/gcc/calls.cc
> > > +++ b/gcc/calls.cc
> > > @@ -2575,14 +2575,25 @@
> > can_implement_as_sibling_call_p (tree exp,
> > >        return false;
> > >      }
> > >
> > > -  /* Doing sibling call optimization needs some work, since
> > > -     structure_value_addr can be allocated on the stack.
> > > -     It does not seem worth the effort since few optimizable
> > > -     sibling calls will return a structure.  */
> > > +  /* For sibling call optimization with a structure return,
> > > +     the return value needs to be the same as the current
> > > function,
> > > +     otherwise the structure return will be allocated in the
> > > stack.  */
> > >    if (structure_value_addr != NULL_RTX)
> > >      {
> > > -      maybe_complain_about_tail_call (exp, _("callee
> returns
> > a
> > > structure"));
> > > -      return false;
> > > +      tree res = DECL_RESULT (current_function_decl);
> > > +      bool ok = false;
> > > +      if (res && DECL_RTL_SET_P (res))
> > > + {
> > > +   rtx res_rtx = DECL_RTL (res);
> > > +   if (MEM_P (res_rtx)
> > > +       && XEXP (res_rtx, 0) == structure_value_addr)
> > > +     ok = true;
> > > + }
> > > +      if (!ok)
> > > + {
> > > +   maybe_complain_about_tail_call (exp, _("callee
> returns a
> > > structure"));
> > > +   return false;
> > > + }
> > >      }
> > >
> > >    /* Check whether the target is able to optimize the call
> diff
> > --
> > > git a/gcc/testsuite/c-c++-common/pr71761-1.c
> > > b/gcc/testsuite/c-c++-common/pr71761-1.c
> > > new file mode 100644
> > > index 00000000000..41f4f834fc7
> > > --- /dev/null
> > > +++ b/gcc/testsuite/c-c++-common/pr71761-1.c
> > > @@ -0,0 +1,17 @@
> > > +/* PR middle-end/71761 */
> > > +/* { dg-do compile { target musttail } } */
> > > +/* { dg-options "-O2" } */
> > > +
> > > +typedef struct Foo {
> > > +   int o[16];
> > > +}Foo;
> > > +
> > > +Foo moo();
> > > +
> > > +Foo goo()
> > > +{
> > > +  [[gnu::musttail]]
> > > +  return moo();
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler "jmp      moo" { target { { i?86-
> > *-*
> > > x86_64-*-* } && { ! ia32 } } } } } */
> > > --
> > > 2.34.1
> >
> 

Reply via email to