On Wed, Apr 16, 2025 at 8:59 AM Kyle Huey via Dwarf-discuss <dwarf-discuss@lists.dwarfstd.org> wrote:
> This may be more of an implementation question than a spec question,
> but this seems like the place to have the discussion regardless.
>
> C++ debuggers that support expression evaluation need to identify the
> `this` pointer argument of member functions because C++ permits
> unqualified access to member variables. DWARF 3 introduced the
> DW_AT_object_pointer attribute for this purpose, and modern versions
> of clang and gcc emit it. lldb actually uses this attribute when
> present to find the `this` pointer argument, falling back to checking
> whether the first formal parameter to the function looks "this-like",
> while gdb appears to use a name lookup at all times. Regardless, these
> end up being interoperable for normal member functions.
>
> Lambdas that capture the `this` of a member function present the same
> need for debuggers, as unqualified accesses within the lambda can also
> access member variables of `this`. Both clang and gcc desugar lambdas
> to an anonymous struct type whose members are the captured variables,
> with an operator() implementation that contains the body of the
> lambda. They, however, differ in their representation of a captured
> `this` pointer.
>
> In clang's case the DWARF for the operator() contains a
> DW_TAG_formal_parameter for a `this` parameter which is a pointer to
> the anonymous struct type. That anonymous struct then contains its own
> `this` member which is the captured `this`. lldb then contains code in
> GetLambdaValueObject that recognizes this "double this" pattern and
> deals with it appropriately.
>
> In gcc's case the DWARF for the operator() contains a
> DW_TAG_formal_parameter for a `__closure` parameter which is a pointer
> to the anonymous struct type. That anonymous struct contains a
> `__this` member which is the captured `this`. Additionally, gcc emits
> a DW_TAG_variable inside the operator() named `this` with the
> appropriate DW_AT_location that traverses the anonymous struct (as it
> does for all captured variables).
>
> In both cases the compilers emit a DW_AT_object_pointer on the
> operator() pointing to the anonymous struct pointer parameter.
>
> This results in neither debugger being able to understand the output
> of the opposite compiler. gdb cannot understand what clang has emitted
> because it looks at the `this` parameter (which points to the
> anonymous struct) and lldb cannot understand what gcc has emitted
> because it expects the "double this" pattern. This is also annoying
> for third-party debuggers (like the one I maintain) because we need to
> recognize and explicitly support both patterns.
>
> I haven't done any research into why the compilers chose to emit what
> they do, but it seems to me[0] like things would be better if clang
> copied gcc's "repeat the captured variables as locals inside
> operator()" behavior (which would make gdb understand clang binaries)
> and then both compilers switched their DW_AT_object_pointers to point
> to the captured `this` if and only if it exists (which would make lldb
> understand gcc binaries), ignoring the anonymous compiler-generated
> struct entirely. Then lambdas that capture `this` would look like
> member functions to debuggers and "just work" without any special
> lambda-aware code.
>
> This would require at least some clarification in the spec since the
> subprogram's DW_AT_object_pointer would point to a local, not a
> parameter, and would point to an object of a different type than the
> DW_TAG_class_type containing the subprogram (or would exist on a
> subprogram not contained in a class at all if compilers elided the
> anonymous struct from DWARF entirely).
>
> Any thoughts?
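
For concreteness, here is a minimal C++ sketch of the kind of source being discussed. The names `Widget`, `value`, `make_getter` and `demo` are made up for illustration and do not come from the message above.

```cpp
#include <cassert>

// Hypothetical type, used only to illustrate the case under discussion.
struct Widget {
    int value = 42;

    // The lambda captures `this`, so the unqualified `value` in its body
    // refers to Widget::value. A debugger stopped inside the lambda has to
    // locate the captured `this` to evaluate the expression `value`.
    auto make_getter() {
        return [this] { return value; };
    }
};

// Small self-check; call it from main() to run the example.
void demo() {
    Widget w;
    auto get = w.make_getter();
    assert(get() == 42);
    w.value = 7;          // the lambda holds a pointer, so it sees the update
    assert(get() == 7);
}
```

Evaluating `value` at a breakpoint inside the lambda body is the expression that a debugger can only resolve once it has found the captured `this`.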

As a clang developer, I have some bias toward the Clang representation here. Lambdas are classes (per the C++ spec), so it still makes sense to me that op() is a member function. That said, giving its "this" pointer another name seems reasonable, since users can't refer to it by that name.

Introducing a bunch of locals that expose the right names/types seems OK to me.

Using object_pointer to refer to a local "this" could have other uses too. Some languages (I forget which ones) have a scope-based object usage, like "foo f; f.x();" -> "using (foo f) { x(); }", so using object_pointer on the scope to refer to the local could be used to support that feature.
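
As a rough illustration of the shape being discussed, the closure type can be written out by hand. This is only a sketch: `Widget`, `Closure`, `captured_this`, `self` and `call_closure` are hypothetical names, and real compilers generate an unnamed type.

```cpp
#include <cassert>

// Hypothetical class whose member function creates the lambda.
struct Widget {
    int value = 42;
};

// Hand-written stand-in for the closure type of `[this] { return value; }`.
struct Closure {
    // Plays the role of the captured-`this` member (named `this` in
    // clang's DWARF and `__this` in gcc's, per the message above).
    Widget* captured_this;

    int operator()() const {
        // The implicit `this` here points at Closure, not Widget. That is
        // the parameter DW_AT_object_pointer currently refers to.
        // `self` is analogous to the local that gcc emits as a
        // DW_TAG_variable named `this` inside operator().
        Widget* const self = captured_this;
        return self->value;
    }
};

// Builds the closure over `w` and invokes it once.
int call_closure(Widget& w) {
    Closure c{&w};
    return c();
}
```

Under the proposal, the object pointer for `operator()` would refer to something like `self` rather than to the pointer to `Closure`.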