Hi

Esben Mose Hansen writes:
> this program SEGFAULTs
>
> #include <algorithm>
>
> int main() {
>   int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
>   const std::size_t nn = sizeof(numbers)/sizeof(int);
>   int sum = 0;
>   int f = 5;
>   std::for_each(&numbers[0], &numbers[nn],  [&]  (int n)  {
>     sum += n * f;
>   });
>
> }
> ...
> I am completely new to gcc hacking, just
> dying to get lambda into gcc 4.5 :)
>
Me to on both counts!  So much so that I've got a working copy of the latest 
lambda branch, svnmerged the latest 4.5.0
trunk into it, fixed a few build issues and started poking around.  I have 
never ventured into gcc internals before so
its all a bit alien at the mo.

> On Thursday 30 April 2009 19:19:31 Ian wrote:
> > When I try to specify the capture it works ((&sum, &f) works too but f is
> > const):
> >
> > #include <algorithm>
> >
> > int
> > main(void)
> > {
> >   int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
> >   const std::size_t nn = sizeof(numbers)/sizeof(int);
> >   int sum = 0;
> >   int f = 5;
> >
> >   //std::for_each(&numbers[0], &numbers[nn], [&](int n) { sum += n * f; });
> >
> >   std::for_each(&numbers[0], &numbers[nn], [&sum, f](int n) { sum += n * f;
> > });
> >
> >   return 0;
> > }
>
>  Yup. In fact, almost any other capture block than the [&] works :) I will try
>  to look at those tree options when I get sober again.
>
It is crashing when invoking a copied 'implicit capture' lambda.  The same 
occurs with:

  int main()
  {
     char i,j,k;
     auto const& f = [&] { i = 1; j = 2; k = 3; };
     auto g = f;
     g();
     return i+j+k;
  }

With explicit captures (i.e. specifying [&i,&j,&k] instead of [&] above) it 
works fine.  Also using f() (in place of
g()) is fine so the code in the lambda and the call to it must be okay.  So I 
started looking into the instance data.

The resulting lambda class in the above program is generated by the compiler to 
look something like the following in
name and structure:

   struct __lambda0
   {
      char &i,&j,&k;
      void operator() () const
      {
         i = 1; j = 2; k = 3;
      }
   };

Looking at the implementation in gcc/cp/{class.c,parser.c,semantics.c} it seems 
that, in implicit capture mode,
references to enclosing scope identifiers are retrospectively added to the 
lambda class on first use in the lambda
body.  This made me worry about the class structure changing as you progress 
through the parse of the lambda body. 
I.e. at the start of the body nothing is captured -- since nothing is 
referenced.  As you meet enclosing scope
references, each is added as a capture member to the lambda class.  Is this 
okay or has something already decided on
the size and structure of the class?  I figured (almost certainly naively and 
incorrectly) that it ought to be similar
to the difference between:

   struct X
   {
      int i;
      void init_i() { i = 1; }
   };

and

   struct X
   {
      void init_i() { i = 1; }
      int i;
   };

I changed the program above to check the sizes of the generated lambda class 
and it is indeed as expected (three
pointers). So the class has the correct size -- why does it not copy?  Surely a 
bitwise copy in this case is
sufficient and that ought to be auto-generated.  -- Except we're in compiler 
land here -- are we supposed to do the
auto-generating?  To test the theory I added

     memcpy(&g,&f,sizeof g);

after the assignment (auto g = f) to force the instance data to be copied from 
f to g.  It worked!  So why is the
compiler not generating suitable code for the lambda class copy -- the size is 
right, but no copy of instance data is
made -- maybe its already decided that the size is zero (as it was before the 
lambda body) and generated 'do-nothing'
copy constructors?

I had a poke around in gcc/cp/{class.c,parser.c,semantics.c} and believe I have 
worked around the issue -- the proper
solution I don't think is as straight-forward (or maybe its more 
straight-forward for a gcc guru?).

Note that the following diff hunk is after an svnmerge of trunk commits into my 
wc of the lambda branch so offsets
will likely be wrong.  I would give a full diff but my wc is in rather a messy 
state at the mo.  I have no idea
whether there are any nasty side effects to doing this but it seems to do the 
copy correctly afterwards.  I have not
looked into it much further at the mo.  Thought I'd just post my findings.

########### snip ##########################
--- gcc/cp/parser.c     (revision 150148)
+++ gcc/cp/parser.c     (working copy)
@@ -6936,6 +6991,24 @@

     cp_parser_lambda_body (parser, lambda_expr);

+    /* relayout again -- to allow for implicit
+     * parameters to have been added to the capture if it was a
+     * 'default capture' -- note that this would not be necessary if
+     * the stack-pointer variant was implemented -- since the layout
+     * would be known.
+     * Relayingout here might have nasty effect if one were to query
+     * sizeof *this from within the body -- would that even be
+     * possible -- *this would refer to the lambda or the enclosing
+     * class instance -- if there was one.???
+     *
+     * NOTE: I think this is a redefinition error; I'm just faking that
+     * it isn't by clearing the size node... need to use stack pointer
+     * method.  But that will likely bring its own issues -- not with
+     * class layout though.
+     */
+    TYPE_SIZE (type) = NULL_TREE;
+    finish_struct_1 (type);
+
     parser->in_function_body = saved_in_function_body;
     parser->num_template_parameter_lists = saved_num_template_parameter_lists;
     --parser->num_classes_being_defined;
#########################################

The way 'default reference capture' is implemented on the lambda branch seems 
to be kind of reactive.  I would expect
that inheriting the stack somehow (maybe using just a stack pointer) would be 
better but without more investigation I
don't know if that is possible or how one would go about doing it.  There is 
also the problem that the lambda would
want its own stack also so its unlikely that you'd be able to simply drop it 
into the same frame as the enclosing
function (though it may not be any different to having a nested scope in a 
function in terms of identifier ref'ing). 
Here is a comment I added into the code as I was going through -- apologies for 
not rewriting!

gcc/cp/semantics.c:5308 (working copy)
/*
 * Rather than looping through the identifiers used in the scope of
 * the lambda and synthesizing individual captures of them, it would
 * be better to ref the stack frame as transparently as possible...
 * e.g. given just three ints i, j and k in scope:
 * Instead of transforming:
 *
 *    [&] { i = 1; j = 2; k = 3; };
 *
 * into
 *
 *    [&i,&j,&k] { i = 1; j = 2; k = 3; };
 *
 * and thus storing three pointers to int, transform it into:
 *
 *    [sp=enclosing-stack-pointer] { var-from-stack(i,sp) = 1;
 *                                   var-from-stack(j,sp) = 2;
 *                                   var-from-stack(k,sp) = 3; };
 *
 * and only store one pointer.  This presumes that var-from-stack
 * can be achieved with no (or negligible) cost (i.e. no worse than
 * the individually captured version above.  Since std algorithms
 * generally take and return copies of the function object passed
 * in it is useful to keep them as small as possible.  One note,
 * are reference_closures still in the standard?
 *
 * I don't know if its possible but it may be possible to 'append'
 * the lambda's stack to the existing scope rather than creating a
 * new constrained scope -- from a logical point of view.  This
 * might be difficult if the compiler's design doesn't lend itself
 * to working like that.
 */

Sorry that this is a bit a of a brain dump.  I'm just finding my way around and 
there is obviously at least some
interest here.  Maybe one of you could try adding the finish_struct_1() call 
and see how much further you get.

Regards,
Adam


Reply via email to