Hi Esben Mose Hansen writes: > this program SEGFAULTs > > #include <algorithm> > > int main() { > int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 }; > const std::size_t nn = sizeof(numbers)/sizeof(int); > int sum = 0; > int f = 5; > std::for_each(&numbers[0], &numbers[nn], [&] (int n) { > sum += n * f; > }); > > } > ... > I am completely new to gcc hacking, just > dying to get lambda into gcc 4.5 :) > Me to on both counts! So much so that I've got a working copy of the latest lambda branch, svnmerged the latest 4.5.0 trunk into it, fixed a few build issues and started poking around. I have never ventured into gcc internals before so its all a bit alien at the mo.
> On Thursday 30 April 2009 19:19:31 Ian wrote: > > When I try to specify the capture it works ((&sum, &f) works too but f is > > const): > > > > #include <algorithm> > > > > int > > main(void) > > { > > int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 }; > > const std::size_t nn = sizeof(numbers)/sizeof(int); > > int sum = 0; > > int f = 5; > > > > //std::for_each(&numbers[0], &numbers[nn], [&](int n) { sum += n * f; }); > > > > std::for_each(&numbers[0], &numbers[nn], [&sum, f](int n) { sum += n * f; > > }); > > > > return 0; > > } > > Yup. In fact, almost any other capture block than the [&] works :) I will try > to look at those tree options when I get sober again. > It is crashing when invoking a copied 'implicit capture' lambda. The same occurs with: int main() { char i,j,k; auto const& f = [&] { i = 1; j = 2; k = 3; }; auto g = f; g(); return i+j+k; } With explicit captures (i.e. specifying [&i,&j,&k] instead of [&] above) it works fine. Also using f() (in place of g()) is fine so the code in the lambda and the call to it must be okay. So I started looking into the instance data. The resulting lambda class in the above program is generated by the compiler to look something like the following in name and structure: struct __lambda0 { char &i,&j,&k; void operator() () const { i = 1; j = 2; k = 3; } }; Looking at the implementation in gcc/cp/{class.c,parser.c,semantics.c} it seems that, in implicit capture mode, references to enclosing scope identifiers are retrospectively added to the lambda class on first use in the lambda body. This made me worry about the class structure changing as you progress through the parse of the lambda body. I.e. at the start of the body nothing is captured -- since nothing is referenced. As you meet enclosing scope references, each is added as a capture member to the lambda class. Is this okay or has something already decided on the size and structure of the class? I figured (almost certainly naively and incorrectly) that it ought to be similar to the difference between: struct X { int i; void init_i() { i = 1; } }; and struct X { void init_i() { i = 1; } int i; }; I changed the program above to check the sizes of the generated lambda class and it is indeed as expected (three pointers). So the class has the correct size -- why does it not copy? Surely a bitwise copy in this case is sufficient and that ought to be auto-generated. -- Except we're in compiler land here -- are we supposed to do the auto-generating? To test the theory I added memcpy(&g,&f,sizeof g); after the assignment (auto g = f) to force the instance data to be copied from f to g. It worked! So why is the compiler not generating suitable code for the lambda class copy -- the size is right, but no copy of instance data is made -- maybe its already decided that the size is zero (as it was before the lambda body) and generated 'do-nothing' copy constructors? I had a poke around in gcc/cp/{class.c,parser.c,semantics.c} and believe I have worked around the issue -- the proper solution I don't think is as straight-forward (or maybe its more straight-forward for a gcc guru?). Note that the following diff hunk is after an svnmerge of trunk commits into my wc of the lambda branch so offsets will likely be wrong. I would give a full diff but my wc is in rather a messy state at the mo. I have no idea whether there are any nasty side effects to doing this but it seems to do the copy correctly afterwards. I have not looked into it much further at the mo. Thought I'd just post my findings. ########### snip ########################## --- gcc/cp/parser.c (revision 150148) +++ gcc/cp/parser.c (working copy) @@ -6936,6 +6991,24 @@ cp_parser_lambda_body (parser, lambda_expr); + /* relayout again -- to allow for implicit + * parameters to have been added to the capture if it was a + * 'default capture' -- note that this would not be necessary if + * the stack-pointer variant was implemented -- since the layout + * would be known. + * Relayingout here might have nasty effect if one were to query + * sizeof *this from within the body -- would that even be + * possible -- *this would refer to the lambda or the enclosing + * class instance -- if there was one.??? + * + * NOTE: I think this is a redefinition error; I'm just faking that + * it isn't by clearing the size node... need to use stack pointer + * method. But that will likely bring its own issues -- not with + * class layout though. + */ + TYPE_SIZE (type) = NULL_TREE; + finish_struct_1 (type); + parser->in_function_body = saved_in_function_body; parser->num_template_parameter_lists = saved_num_template_parameter_lists; --parser->num_classes_being_defined; ######################################### The way 'default reference capture' is implemented on the lambda branch seems to be kind of reactive. I would expect that inheriting the stack somehow (maybe using just a stack pointer) would be better but without more investigation I don't know if that is possible or how one would go about doing it. There is also the problem that the lambda would want its own stack also so its unlikely that you'd be able to simply drop it into the same frame as the enclosing function (though it may not be any different to having a nested scope in a function in terms of identifier ref'ing). Here is a comment I added into the code as I was going through -- apologies for not rewriting! gcc/cp/semantics.c:5308 (working copy) /* * Rather than looping through the identifiers used in the scope of * the lambda and synthesizing individual captures of them, it would * be better to ref the stack frame as transparently as possible... * e.g. given just three ints i, j and k in scope: * Instead of transforming: * * [&] { i = 1; j = 2; k = 3; }; * * into * * [&i,&j,&k] { i = 1; j = 2; k = 3; }; * * and thus storing three pointers to int, transform it into: * * [sp=enclosing-stack-pointer] { var-from-stack(i,sp) = 1; * var-from-stack(j,sp) = 2; * var-from-stack(k,sp) = 3; }; * * and only store one pointer. This presumes that var-from-stack * can be achieved with no (or negligible) cost (i.e. no worse than * the individually captured version above. Since std algorithms * generally take and return copies of the function object passed * in it is useful to keep them as small as possible. One note, * are reference_closures still in the standard? * * I don't know if its possible but it may be possible to 'append' * the lambda's stack to the existing scope rather than creating a * new constrained scope -- from a logical point of view. This * might be difficult if the compiler's design doesn't lend itself * to working like that. */ Sorry that this is a bit a of a brain dump. I'm just finding my way around and there is obviously at least some interest here. Maybe one of you could try adding the finish_struct_1() call and see how much further you get. Regards, Adam