On 1 December 2016 at 17:40, Richard Biener <rguent...@suse.de> wrote:
> On Thu, 1 Dec 2016, Prathamesh Kulkarni wrote:
>
>> On 25 November 2016 at 21:17, Jeff Law <l...@redhat.com> wrote:
>> > On 11/25/2016 01:07 AM, Richard Biener wrote:
>> >
>> >>> For the tail-call, issue should we artificially create a lhs and use that
>> >>> as return value (perhaps by a separate pass before tailcall) ?
>> >>>
>> >>> __builtin_memcpy (a1, a2, a3);
>> >>> return a1;
>> >>>
>> >>> gets transformed to:
>> >>> _1 = __builtin_memcpy (a1, a2, a3)
>> >>> return _1;
>> >>>
>> >>> So tail-call optimization pass would see the IL in it's expected form.
>> >>
>> >>
>> >> As said, a RTL expert needs to chime in here.  Iff then tail-call
>> >> itself should do this rewrite.  But if this form is required to make
>> >> things work (I suppose you checked it _does_ actually work?) then
>> >> we'd need to make sure later passes do not undo it.  So it looks
>> >> fragile to me.  OTOH I seem to remember that the flags we set on
>> >> GIMPLE are merely a hint to RTL expansion and the tailcalling is
>> >> verified again there?
>> >
>> > So tail calling actually sits on the border between trees and RTL.
>> > Essentially it's an expand-time decision as we use information from trees 
>> > as
>> > well as low level target information.
>> >
>> > I would not expect the former sequence to tail call.  The tail calling code
>> > does not know that the return value from memcpy will be a1.  Thus the tail
>> > calling code has to assume that it'll have to copy a1 into the return
>> > register after returning from memcpy, which obviously can't be done if we
>> > tail called memcpy.
>> >
>> > The second form is much more likely to turn into a tail call sequence
>> > because the return value from memcpy will be sitting in the proper 
>> > register.
>> > This form out to work for most calling conventions that allow tail calls.
>> >
>> > We could (in theory) try and exploit the fact that memcpy returns its first
>> > argument as a return value, but that would only be helpful on a target 
>> > where
>> > the first argument and return value use the same register. So I'd have a
>> > slight preference to rewriting per Prathamesh's suggestion above since it's
>> > more general.
>> Thanks for the suggestion. The attached patch creates artificial lhs,
>> and returns it if the function returns it's argument and that argument
>> is used as return-value.
>>
>> eg:
>> f (void * a1, void * a2, long unsigned int a3)
>> {
>>   <bb 2> [0.0%]:
>>   # .MEM_5 = VDEF <.MEM_1(D)>
>>   __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D));
>>   # VUSE <.MEM_5>
>>   return a1_2(D);
>>
>> }
>>
>> is transformed to:
>> f (void * a1, void * a2, long unsigned int a3)
>> {
>>   void * _6;
>>
>>   <bb 2> [0.0%]:
>>   # .MEM_5 = VDEF <.MEM_1(D)>
>>   _6 = __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D));
>>   # VUSE <.MEM_5>
>>   return _6;
>>
>> }
>>
>> While testing, I came across an issue with function f() defined
>> intail-padding1.C:
>> struct X
>> {
>>   ~X() {}
>>   int n;
>>   char d;
>> };
>>
>> X f()
>> {
>>   X nrvo;
>>   __builtin_memset (&nrvo, 0, sizeof(X));
>>   return nrvo;
>> }
>>
>> input to the pass:
>> X f() ()
>> {
>>   <bb 2> [0.0%]:
>>   # .MEM_3 = VDEF <.MEM_1(D)>
>>   __builtin_memset (nrvo_2(D), 0, 8);
>>   # VUSE <.MEM_3>
>>   return nrvo_2(D);
>>
>> }
>>
>> verify_gimple_return failed with:
>> tail-padding1.C:13:1: error: invalid conversion in return statement
>>  }
>>  ^
>> struct X
>>
>> struct X &
>>
>> # VUSE <.MEM_3>
>> return _4;
>>
>> It seems the return type of function (struct X) differs with the type
>> of return value (struct X&).
>> Not sure how this is possible ?
>
> You need to honor DECL_BY_REFERENCE of DECL_RESULT.
Thanks! Gating on !DECL_BY_REFERENCE (DECL_RESULT (cfun->decl))
resolved the error.
Does the attached version look OK ?
Validation in progress.

Thanks,
Prathamesh
>
>> To work around that, I guarded the transform on:
>> useless_type_conversion_p (TREE_TYPE (TREE_TYPE (cfun->decl)),
>>                                              TREE_TYPE (retval)))
>>
>> in the patch. Does that look OK ?
>>
>> Bootstrap+tested on x86_64-unknown-linux-gnu with --enable-languages=all,ada.
>> Cross-tested on arm*-*-*, aarch64*-*-*.
>>
>> Thanks,
>> Prathamesh
>> >
>> >
>> > Jeff
>>
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/tailcall-9.c 
b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-9.c
new file mode 100644
index 0000000..b3fdc6c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-9.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-tailc-details" } */
+
+void *f(void *a1, void *a2, __SIZE_TYPE__ a3)
+{
+  __builtin_memcpy (a1, a2, a3);
+  return a1;
+}
+
+/* { dg-final { scan-tree-dump-times "Found tail call" 1 "tailc" } } */ 
diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index 66a0a4c..a1c8bd7 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -401,6 +401,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
   basic_block abb;
   size_t idx;
   tree var;
+  greturn *ret_stmt = NULL;
 
   if (!single_succ_p (bb))
     return;
@@ -408,6 +409,8 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
   for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
     {
       stmt = gsi_stmt (gsi);
+      if (!ret_stmt)
+       ret_stmt = dyn_cast<greturn *> (stmt);
 
       /* Ignore labels, returns, nops, clobbers and debug stmts.  */
       if (gimple_code (stmt) == GIMPLE_LABEL
@@ -422,6 +425,36 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
        {
          call = as_a <gcall *> (stmt);
          ass_var = gimple_call_lhs (call);
+         if (!ass_var)
+           {
+             /* Check if function returns one if it's arguments
+                and that argument is used as return value.
+                In that case create an artificial lhs to call_stmt,
+                and set it as the return value.  */
+
+             unsigned rf = gimple_call_return_flags (call);
+             if (rf & ERF_RETURNS_ARG)
+               {
+                 unsigned argnum = rf & ERF_RETURN_ARG_MASK;
+                 if (argnum < gimple_call_num_args (call)
+                     && ret_stmt)
+                   {
+                     tree arg = gimple_call_arg (call, argnum);
+                     tree retval = gimple_return_retval (ret_stmt);
+                     if (retval
+                         && TREE_CODE (retval) == SSA_NAME
+                         && operand_equal_p (retval, arg, 0)
+                         && !DECL_BY_REFERENCE (DECL_RESULT (cfun->decl)))
+                       {
+                         ass_var = make_ssa_name (TREE_TYPE (arg));
+                         gimple_call_set_lhs (call, ass_var);
+                         update_stmt (call);
+                         gimple_return_set_retval (ret_stmt, ass_var);
+                         update_stmt (ret_stmt);
+                       }
+                   }
+               }
+           }
          break;
        }
 

Reply via email to