Re: [PATCH][PR66010] Don't take address of ap unless necessary

Richard Biener Mon, 11 May 2015 00:47:57 -0700

On Fri, 8 May 2015, Tom de Vries wrote:

> Hi,
> 
> this patch fixes PR66010.
> 
> 
> I.
> 
> Consider this test-case, with a va_list passed from f2 to f2_1:
> ...
> #include <stdarg.h>
> 
> inline int __attribute__((always_inline))
> f2_1 (va_list ap)
> {
>   return va_arg (ap, int);
> }
> 
> int
> f2 (int i, ...)
> {
>   int res;
>   va_list ap;
> 
>   va_start (ap, i);
>   res = f2_1 (ap);
>   va_end (ap);
> 
>   return res;
> }
> ...
> 
> When compiling at -O2, before eline we have:
> ...
> f2_1 (struct  * apD.1832)
> {
>   intD.6 _3;
> 
>   # .MEM_2 = VDEF <.MEM_1(D)>
>   # USE = anything
>   # CLB = anything
>   _3 = VA_ARG (&apD.1832, 0B);
> 
>   # VUSE <.MEM_2>
>   return _3;
> }
> 
> f2 (intD.6 iD.1835)
> {
>   struct  apD.1839[1];
>   intD.6 resD.1838;
>   intD.6 _6;
> 
>   # .MEM_2 = VDEF <.MEM_1(D)>
>   # USE = anything
>   # CLB = anything
>   __builtin_va_startD.1030 (&apD.1839, 0);
> 
>   # .MEM_3 = VDEF <.MEM_2>
>   # USE = anything
>   # CLB = anything
>   res_4 = f2_1D.1833 (&apD.1839);
> 
>   # .MEM_5 = VDEF <.MEM_3>
>   # USE = anything
>   # CLB = anything
>   __builtin_va_endD.1029 (&apD.1839);
> 
>   _6 = res_4;
> 
>   # .MEM_7 = VDEF <.MEM_5>
>   apD.1839 ={v} {CLOBBER};
> 
>   # VUSE <.MEM_7>
>   return _6;
> }
> ...
> 
> Because the va_list type is an array type:
> ...
>   struct  apD.1839[1];
> ...
> 
> we're passing the location of the initial element:
> ...
>   res_4 = f2_1D.1833 (&apD.1839);
> ...
> 
> And the type of the parameter in f2_1 is accordingly a pointer to array
> element:
> ...
> f2_1 (struct  * apD.1832)
> ...
> 
> That means the address operator here is superfluous.
> ...
>   _3 = VA_ARG (&apD.1832, 0B);
> ...
> Or, differently put, when we take the address of ap in va_start and va_end in
> f2, the result is a pointer to array element type.
> When we take the address of ap in f2_2, the result is a pointer to pointer to
> array element type.
> 
> This extra indirection doesn't cause wrong code to be generated. The va_arg
> expansion code handles this correctly, thanks to the combination of:
> - an unconditional build_fold_indirect_ref in expand_ifn_va_arg_1 which
> removes
>   an indirection, and
> - a fixup in gimplify_va_arg_internal that again adds an indirection in some
>   cases.
> 
> The call gets inlined, and before pass_stdarg we have:
> ...
> f2 (intD.6 iD.1835)
> {
>   struct  * apD.1849;
>   struct  apD.1839[1];
>   intD.6 _6;
> 
>   # .MEM_2 = VDEF <.MEM_1(D)>
>   # USE = nonlocal escaped
>   # CLB = nonlocal escaped { D.1839 } (escaped)
>   __builtin_va_startD.1030 (&apD.1839, 0);
> 
>   # .MEM_3 = VDEF <.MEM_2>
>   apD.1849 = &apD.1839;
> 
>   # .MEM_7 = VDEF <.MEM_3>
>   # USE = nonlocal null { D.1839 D.1849 } (escaped)
>   # CLB = nonlocal null { D.1839 D.1849 } (escaped)
>   _6 = VA_ARG (&apD.1849, 0B);
> ...
> 
> And after expanding ifn_va_arg, we have:
> ...
> f2 (intD.6 iD.1835)
> {
>   struct  * ap.3D.1853;
>   struct  apD.1839[1];
> 
>   # .MEM_2 = VDEF <.MEM_1(D)>
>   # USE = nonlocal escaped
>   # CLB = nonlocal escaped { D.1839 } (escaped)
>   __builtin_va_startD.1030 (&apD.1839, 0);
> 
>   ap_3 = &apD.1839;
> 
>   ap.3_10 = ap_3;
> 
>   # VUSE <.MEM_2>
>   _11 = ap.3_10->gp_offsetD.2;
>   <SNIP>
> ...
> 
> The pass_stdarg optimization fails:
> ...
> f2: va_list escapes 1, needs to save all GPR units and all FPR units.
> ...
> 
> It fails on the superfluous address operator:
> ...
> va_list escapes in ap_3 = &apD.1839;
> ...
> 
> 
> II.
> 
> The patch prevents the superfluous address operator from being added.
> 
> It also removes the need for the fixup in gimplify_va_arg_internal, by
> deciding in gimplify_va_arg_expr whether the build_fold_indirect_ref in
> expand_ifn_va_arg_1 is needed before passing ap on to
> gimplify_va_arg_internal. This decision is encoded as an extra argument to
> ifn_va_arg.
> 
> 
> III.
> 
> Using the patch, before inlining we can see the address operator has been
> removed in va_arg:
> ...
> f2_1 (struct  * apD.1832)
> {
>   intD.6 _4;
> 
>   # .MEM_3 = VDEF <.MEM_1(D)>
>   # USE = anything
>   # CLB = anything
> 
>   _4 = VA_ARG (ap_2(D), 0B, 0);
>   # VUSE <.MEM_3>
>   return _4;
> }
> ...
> 
> And after inlining, we see that va_start and va_arg now use the same ap:
> ...
> f2 (intD.6 iD.1835)
> {
>   struct  apD.1839[1];
> 
>   # .MEM_2 = VDEF <.MEM_1(D)>
>   # USE = anything
>   # CLB = anything
>   __builtin_va_startD.1030 (&apD.1839, 0);
> 
>   # .MEM_3 = VDEF <.MEM_2>
>   # USE = anything
>   # CLB = anything
>   _8 = VA_ARG (&apD.1839, 0B, 0);
> ...
> 
> That allows the pass_stdarg optimization to succeed:
> ...
> f2: va_list escapes 0, needs to save 8 GPR units and 0 FPR units.
> ...
> 
> Bootstrapped and reg-tested on x86_64, with and without -m32.
> 
> OK for trunk?
> 
> [ FWIW, I suspect this patch will make life easier for the reimplementation of
> the pass_stdarg optimization using ifn_va_arg. ]


+  if (canon_va_type != NULL)
+    {
+      if (!(TREE_CODE (canon_va_type) == ARRAY_TYPE
+           && TREE_CODE (va_type) != ARRAY_TYPE))
+       /* In gimplify_va_arg_expr we take the address of the ap argument, 
mark
+          it addressable now.  */
+       mark_addressable (expr);

can we "simplify" this and ...

-       }
-
+      gcc_assert (TREE_CODE (TREE_TYPE (valist)) != ARRAY_TYPE);
       gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);

this to use [!]POINTER_TYPE_P ()?  Why do we arrive with non-array
va_type but array canon_va_type in build_va_arg?  I suppose the
va_list argument already decayed to a pointer then (in which case
the argument should already be addressable?)?

I think the overall idea of the patch is good - I'm just worried about
special-casing of ARRAY_TYPE vs. non-pointer-type (because it's the
latter that we ultimately want...).

Thanks,
Richard.

Re: [PATCH][PR66010] Don't take address of ap unless necessary

Reply via email to