> -----Original Message----- > From: Andrew Pinski (QUIC) <quic_apin...@quicinc.com> > Sent: Wednesday, June 25, 2025 4:20 PM > To: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>; gcc- > patc...@gcc.gnu.org > Subject: RE: [PATCH] expand: Allow sibcalling for return > structures in some cases [PR71761] > > > -----Original Message----- > > From: Andrew Pinski (QUIC) <quic_apin...@quicinc.com> > > Sent: Monday, June 23, 2025 11:39 AM > > To: Andrew Pinski (QUIC) <quic_apin...@quicinc.com>; gcc- > > patc...@gcc.gnu.org > > Subject: RE: [PATCH] expand: Allow sibcalling for return > structures in > > some cases [PR71761] > > > > > -----Original Message----- > > > From: Andrew Pinski (QUIC) <quic_apin...@quicinc.com> > > > Sent: Monday, June 23, 2025 8:01 AM > > > To: gcc-patches@gcc.gnu.org > > > Cc: Andrew Pinski (QUIC) <quic_apin...@quicinc.com> > > > Subject: [PATCH] expand: Allow sibcalling for return > > structures in > > > some cases [PR71761] > > > > > > In the case of tailing call with a return of a structure, > > currently > > > all large structures are rejected. We can allow the case > were > > the > > > return of the "tail call" function is setting the return value > of > > the > > > current function. This allows for the musttail that is > located > > in > > > pr71761-1.c. > > > > > > This should be safe as sibcalls here is RSO is always set for > > this > > > case so the current function no longer owns the space. > > > > > > Bootstrapped and tested on x86_64-linux-gnu with no > > regressions. > > > > Looks like there is some issues with aarch64 which I will > need to look > > into to see what is going wrong. Bootstrap fails on aarch64; > looks > > like some wrong code happening while compiling gcc itself. > > > > So right for now, this patch is withdrawn while I figure out > what is > > going wrong. > > Just a quick update on this and writing notes up on it. The > following C parser functions seems to be mis-compiling: > c_parser_cast_expression > c_parser_unary_expression > > c_parser_cast_expression tail calls into > c_parser_unary_expression and c_parser_unary_expression > tail calls into c_parser_postfix_expression > > This is for the default cases in both. > I had looked at the resulting assembly and I could not spot > what was going wrong at this point but still look further.
Some good news I have got a reduced testcase: ``` typedef struct token { char const* tok_start; char const* tok_end; int tok_type; unsigned identifier_hash; }token; [[gnu::noinline,gnu::noipa]] token f() { return (token){}; } [[gnu::noinline,gnu::noipa]] void gh() { // Clobber the register containing the struct address asm("mov x8, 0":::"x8"); } [[gnu::noinline,gnu::noipa]] token g(int t) { if (t) gh(); [[gnu::musttail]] return f(); } int main() { token t = g(1); return 0; } ``` The problem is we don't get the correct value for x8 for the call to f. Now I need to see where x8 gets set in the backend for the function call. Maybe there is a path which is wrong for tail calls. Thanks, Andrew > > > > > > Thanks, > > Andrew > > > > > > > > PR middle-end/71761 > > > > > > gcc/ChangeLog: > > > > > > * calls.cc (can_implement_as_sibling_call_p): Don't reject > > > "structured" returns if the addr is the same as the current > > > result decl's memory. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * c-c++-common/pr71761-1.c: New test. > > > > > > Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com> > > > --- > > > gcc/calls.cc | 23 +++++++++++++++++------ > > > gcc/testsuite/c-c++-common/pr71761-1.c | 17 > > > +++++++++++++++++ > > > 2 files changed, 34 insertions(+), 6 deletions(-) create > mode > > > 100644 gcc/testsuite/c-c++-common/pr71761-1.c > > > > > > diff --git a/gcc/calls.cc b/gcc/calls.cc index > > > 1dcda20a06e..d1f3e17591f 100644 > > > --- a/gcc/calls.cc > > > +++ b/gcc/calls.cc > > > @@ -2575,14 +2575,25 @@ > > can_implement_as_sibling_call_p (tree exp, > > > return false; > > > } > > > > > > - /* Doing sibling call optimization needs some work, since > > > - structure_value_addr can be allocated on the stack. > > > - It does not seem worth the effort since few optimizable > > > - sibling calls will return a structure. */ > > > + /* For sibling call optimization with a structure return, > > > + the return value needs to be the same as the current > > > function, > > > + otherwise the structure return will be allocated in the > > > stack. */ > > > if (structure_value_addr != NULL_RTX) > > > { > > > - maybe_complain_about_tail_call (exp, _("callee > returns > > a > > > structure")); > > > - return false; > > > + tree res = DECL_RESULT (current_function_decl); > > > + bool ok = false; > > > + if (res && DECL_RTL_SET_P (res)) > > > + { > > > + rtx res_rtx = DECL_RTL (res); > > > + if (MEM_P (res_rtx) > > > + && XEXP (res_rtx, 0) == structure_value_addr) > > > + ok = true; > > > + } > > > + if (!ok) > > > + { > > > + maybe_complain_about_tail_call (exp, _("callee > returns a > > > structure")); > > > + return false; > > > + } > > > } > > > > > > /* Check whether the target is able to optimize the call > diff > > -- > > > git a/gcc/testsuite/c-c++-common/pr71761-1.c > > > b/gcc/testsuite/c-c++-common/pr71761-1.c > > > new file mode 100644 > > > index 00000000000..41f4f834fc7 > > > --- /dev/null > > > +++ b/gcc/testsuite/c-c++-common/pr71761-1.c > > > @@ -0,0 +1,17 @@ > > > +/* PR middle-end/71761 */ > > > +/* { dg-do compile { target musttail } } */ > > > +/* { dg-options "-O2" } */ > > > + > > > +typedef struct Foo { > > > + int o[16]; > > > +}Foo; > > > + > > > +Foo moo(); > > > + > > > +Foo goo() > > > +{ > > > + [[gnu::musttail]] > > > + return moo(); > > > +} > > > + > > > +/* { dg-final { scan-assembler "jmp moo" { target { { i?86- > > *-* > > > x86_64-*-* } && { ! ia32 } } } } } */ > > > -- > > > 2.34.1 > > >