On Fri, 8 May 2015, Tom de Vries wrote: > Hi, > > this patch fixes PR66010. > > > I. > > Consider this test-case, with a va_list passed from f2 to f2_1: > ... > #include <stdarg.h> > > inline int __attribute__((always_inline)) > f2_1 (va_list ap) > { > return va_arg (ap, int); > } > > int > f2 (int i, ...) > { > int res; > va_list ap; > > va_start (ap, i); > res = f2_1 (ap); > va_end (ap); > > return res; > } > ... > > When compiling at -O2, before eline we have: > ... > f2_1 (struct * apD.1832) > { > intD.6 _3; > > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = anything > # CLB = anything > _3 = VA_ARG (&apD.1832, 0B); > > # VUSE <.MEM_2> > return _3; > } > > f2 (intD.6 iD.1835) > { > struct apD.1839[1]; > intD.6 resD.1838; > intD.6 _6; > > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = anything > # CLB = anything > __builtin_va_startD.1030 (&apD.1839, 0); > > # .MEM_3 = VDEF <.MEM_2> > # USE = anything > # CLB = anything > res_4 = f2_1D.1833 (&apD.1839); > > # .MEM_5 = VDEF <.MEM_3> > # USE = anything > # CLB = anything > __builtin_va_endD.1029 (&apD.1839); > > _6 = res_4; > > # .MEM_7 = VDEF <.MEM_5> > apD.1839 ={v} {CLOBBER}; > > # VUSE <.MEM_7> > return _6; > } > ... > > Because the va_list type is an array type: > ... > struct apD.1839[1]; > ... > > we're passing the location of the initial element: > ... > res_4 = f2_1D.1833 (&apD.1839); > ... > > And the type of the parameter in f2_1 is accordingly a pointer to array > element: > ... > f2_1 (struct * apD.1832) > ... > > That means the address operator here is superfluous. > ... > _3 = VA_ARG (&apD.1832, 0B); > ... > Or, differently put, when we take the address of ap in va_start and va_end in > f2, the result is a pointer to array element type. > When we take the address of ap in f2_2, the result is a pointer to pointer to > array element type. > > This extra indirection doesn't cause wrong code to be generated. The va_arg > expansion code handles this correctly, thanks to the combination of: > - an unconditional build_fold_indirect_ref in expand_ifn_va_arg_1 which > removes > an indirection, and > - a fixup in gimplify_va_arg_internal that again adds an indirection in some > cases. > > The call gets inlined, and before pass_stdarg we have: > ... > f2 (intD.6 iD.1835) > { > struct * apD.1849; > struct apD.1839[1]; > intD.6 _6; > > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = nonlocal escaped > # CLB = nonlocal escaped { D.1839 } (escaped) > __builtin_va_startD.1030 (&apD.1839, 0); > > # .MEM_3 = VDEF <.MEM_2> > apD.1849 = &apD.1839; > > # .MEM_7 = VDEF <.MEM_3> > # USE = nonlocal null { D.1839 D.1849 } (escaped) > # CLB = nonlocal null { D.1839 D.1849 } (escaped) > _6 = VA_ARG (&apD.1849, 0B); > ... > > And after expanding ifn_va_arg, we have: > ... > f2 (intD.6 iD.1835) > { > struct * ap.3D.1853; > struct apD.1839[1]; > > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = nonlocal escaped > # CLB = nonlocal escaped { D.1839 } (escaped) > __builtin_va_startD.1030 (&apD.1839, 0); > > ap_3 = &apD.1839; > > ap.3_10 = ap_3; > > # VUSE <.MEM_2> > _11 = ap.3_10->gp_offsetD.2; > <SNIP> > ... > > The pass_stdarg optimization fails: > ... > f2: va_list escapes 1, needs to save all GPR units and all FPR units. > ... > > It fails on the superfluous address operator: > ... > va_list escapes in ap_3 = &apD.1839; > ... > > > II. > > The patch prevents the superfluous address operator from being added. > > It also removes the need for the fixup in gimplify_va_arg_internal, by > deciding in gimplify_va_arg_expr whether the build_fold_indirect_ref in > expand_ifn_va_arg_1 is needed before passing ap on to > gimplify_va_arg_internal. This decision is encoded as an extra argument to > ifn_va_arg. > > > III. > > Using the patch, before inlining we can see the address operator has been > removed in va_arg: > ... > f2_1 (struct * apD.1832) > { > intD.6 _4; > > # .MEM_3 = VDEF <.MEM_1(D)> > # USE = anything > # CLB = anything > > _4 = VA_ARG (ap_2(D), 0B, 0); > # VUSE <.MEM_3> > return _4; > } > ... > > And after inlining, we see that va_start and va_arg now use the same ap: > ... > f2 (intD.6 iD.1835) > { > struct apD.1839[1]; > > # .MEM_2 = VDEF <.MEM_1(D)> > # USE = anything > # CLB = anything > __builtin_va_startD.1030 (&apD.1839, 0); > > # .MEM_3 = VDEF <.MEM_2> > # USE = anything > # CLB = anything > _8 = VA_ARG (&apD.1839, 0B, 0); > ... > > That allows the pass_stdarg optimization to succeed: > ... > f2: va_list escapes 0, needs to save 8 GPR units and 0 FPR units. > ... > > Bootstrapped and reg-tested on x86_64, with and without -m32. > > OK for trunk? > > [ FWIW, I suspect this patch will make life easier for the reimplementation of > the pass_stdarg optimization using ifn_va_arg. ]
+ if (canon_va_type != NULL) + { + if (!(TREE_CODE (canon_va_type) == ARRAY_TYPE + && TREE_CODE (va_type) != ARRAY_TYPE)) + /* In gimplify_va_arg_expr we take the address of the ap argument, mark + it addressable now. */ + mark_addressable (expr); can we "simplify" this and ... - } - + gcc_assert (TREE_CODE (TREE_TYPE (valist)) != ARRAY_TYPE); gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue); this to use [!]POINTER_TYPE_P ()? Why do we arrive with non-array va_type but array canon_va_type in build_va_arg? I suppose the va_list argument already decayed to a pointer then (in which case the argument should already be addressable?)? I think the overall idea of the patch is good - I'm just worried about special-casing of ARRAY_TYPE vs. non-pointer-type (because it's the latter that we ultimately want...). Thanks, Richard.