On 1 December 2016 at 17:40, Richard Biener <rguent...@suse.de> wrote: > On Thu, 1 Dec 2016, Prathamesh Kulkarni wrote: > >> On 25 November 2016 at 21:17, Jeff Law <l...@redhat.com> wrote: >> > On 11/25/2016 01:07 AM, Richard Biener wrote: >> > >> >>> For the tail-call, issue should we artificially create a lhs and use that >> >>> as return value (perhaps by a separate pass before tailcall) ? >> >>> >> >>> __builtin_memcpy (a1, a2, a3); >> >>> return a1; >> >>> >> >>> gets transformed to: >> >>> _1 = __builtin_memcpy (a1, a2, a3) >> >>> return _1; >> >>> >> >>> So tail-call optimization pass would see the IL in it's expected form. >> >> >> >> >> >> As said, a RTL expert needs to chime in here. Iff then tail-call >> >> itself should do this rewrite. But if this form is required to make >> >> things work (I suppose you checked it _does_ actually work?) then >> >> we'd need to make sure later passes do not undo it. So it looks >> >> fragile to me. OTOH I seem to remember that the flags we set on >> >> GIMPLE are merely a hint to RTL expansion and the tailcalling is >> >> verified again there? >> > >> > So tail calling actually sits on the border between trees and RTL. >> > Essentially it's an expand-time decision as we use information from trees >> > as >> > well as low level target information. >> > >> > I would not expect the former sequence to tail call. The tail calling code >> > does not know that the return value from memcpy will be a1. Thus the tail >> > calling code has to assume that it'll have to copy a1 into the return >> > register after returning from memcpy, which obviously can't be done if we >> > tail called memcpy. >> > >> > The second form is much more likely to turn into a tail call sequence >> > because the return value from memcpy will be sitting in the proper >> > register. >> > This form out to work for most calling conventions that allow tail calls. >> > >> > We could (in theory) try and exploit the fact that memcpy returns its first >> > argument as a return value, but that would only be helpful on a target >> > where >> > the first argument and return value use the same register. So I'd have a >> > slight preference to rewriting per Prathamesh's suggestion above since it's >> > more general. >> Thanks for the suggestion. The attached patch creates artificial lhs, >> and returns it if the function returns it's argument and that argument >> is used as return-value. >> >> eg: >> f (void * a1, void * a2, long unsigned int a3) >> { >> <bb 2> [0.0%]: >> # .MEM_5 = VDEF <.MEM_1(D)> >> __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D)); >> # VUSE <.MEM_5> >> return a1_2(D); >> >> } >> >> is transformed to: >> f (void * a1, void * a2, long unsigned int a3) >> { >> void * _6; >> >> <bb 2> [0.0%]: >> # .MEM_5 = VDEF <.MEM_1(D)> >> _6 = __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D)); >> # VUSE <.MEM_5> >> return _6; >> >> } >> >> While testing, I came across an issue with function f() defined >> intail-padding1.C: >> struct X >> { >> ~X() {} >> int n; >> char d; >> }; >> >> X f() >> { >> X nrvo; >> __builtin_memset (&nrvo, 0, sizeof(X)); >> return nrvo; >> } >> >> input to the pass: >> X f() () >> { >> <bb 2> [0.0%]: >> # .MEM_3 = VDEF <.MEM_1(D)> >> __builtin_memset (nrvo_2(D), 0, 8); >> # VUSE <.MEM_3> >> return nrvo_2(D); >> >> } >> >> verify_gimple_return failed with: >> tail-padding1.C:13:1: error: invalid conversion in return statement >> } >> ^ >> struct X >> >> struct X & >> >> # VUSE <.MEM_3> >> return _4; >> >> It seems the return type of function (struct X) differs with the type >> of return value (struct X&). >> Not sure how this is possible ? > > You need to honor DECL_BY_REFERENCE of DECL_RESULT. Thanks! Gating on !DECL_BY_REFERENCE (DECL_RESULT (cfun->decl)) resolved the error. Does the attached version look OK ? Validation in progress.
Thanks, Prathamesh > >> To work around that, I guarded the transform on: >> useless_type_conversion_p (TREE_TYPE (TREE_TYPE (cfun->decl)), >> TREE_TYPE (retval))) >> >> in the patch. Does that look OK ? >> >> Bootstrap+tested on x86_64-unknown-linux-gnu with --enable-languages=all,ada. >> Cross-tested on arm*-*-*, aarch64*-*-*. >> >> Thanks, >> Prathamesh >> > >> > >> > Jeff >> > > -- > Richard Biener <rguent...@suse.de> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > 21284 (AG Nuernberg)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/tailcall-9.c b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-9.c new file mode 100644 index 0000000..b3fdc6c --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-9.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-tailc-details" } */ + +void *f(void *a1, void *a2, __SIZE_TYPE__ a3) +{ + __builtin_memcpy (a1, a2, a3); + return a1; +} + +/* { dg-final { scan-tree-dump-times "Found tail call" 1 "tailc" } } */ diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c index 66a0a4c..a1c8bd7 100644 --- a/gcc/tree-tailcall.c +++ b/gcc/tree-tailcall.c @@ -401,6 +401,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret) basic_block abb; size_t idx; tree var; + greturn *ret_stmt = NULL; if (!single_succ_p (bb)) return; @@ -408,6 +409,8 @@ find_tail_calls (basic_block bb, struct tailcall **ret) for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi)) { stmt = gsi_stmt (gsi); + if (!ret_stmt) + ret_stmt = dyn_cast<greturn *> (stmt); /* Ignore labels, returns, nops, clobbers and debug stmts. */ if (gimple_code (stmt) == GIMPLE_LABEL @@ -422,6 +425,36 @@ find_tail_calls (basic_block bb, struct tailcall **ret) { call = as_a <gcall *> (stmt); ass_var = gimple_call_lhs (call); + if (!ass_var) + { + /* Check if function returns one if it's arguments + and that argument is used as return value. + In that case create an artificial lhs to call_stmt, + and set it as the return value. */ + + unsigned rf = gimple_call_return_flags (call); + if (rf & ERF_RETURNS_ARG) + { + unsigned argnum = rf & ERF_RETURN_ARG_MASK; + if (argnum < gimple_call_num_args (call) + && ret_stmt) + { + tree arg = gimple_call_arg (call, argnum); + tree retval = gimple_return_retval (ret_stmt); + if (retval + && TREE_CODE (retval) == SSA_NAME + && operand_equal_p (retval, arg, 0) + && !DECL_BY_REFERENCE (DECL_RESULT (cfun->decl))) + { + ass_var = make_ssa_name (TREE_TYPE (arg)); + gimple_call_set_lhs (call, ass_var); + update_stmt (call); + gimple_return_set_retval (ret_stmt, ass_var); + update_stmt (ret_stmt); + } + } + } + } break; }