On Wed, Apr 16, 2025 at 8:59 AM Kyle Huey via Dwarf-discuss <
dwarf-discuss@lists.dwarfstd.org> wrote:

> This may be more of an implementation question than a spec question,
> but this seems like the place to have the discussion regardless.
>
> C++ debuggers that support expression evaluation need to identify the
> `this` pointer argument of member functions because C++ permits
> unqualified access to member variables. DWARF 3 introduced the
> DW_AT_object_pointer attribute for this purpose, and modern versions
> of clang and gcc emit it. lldb actually uses this attribute when
> present to find the this pointer argument, falling back to checking to
> see if the first formal parameter to the function looks "this-like",
> while gdb appears to use a name lookup at all times. Regardless, these
> end up being interoperable for normal member functions.
>
> Lambdas that capture the `this` of a member function present the same
> need for debuggers, as unqualified accesses within the lambda can also
> access member variables of `this`. Both clang and gcc desugar lambdas
> to an anonymous struct type whose members are the captured variables,
> with an operator() implementation that contains the body of the
> lambda. They differ, however, in their representation of a captured
> `this` pointer.
>
> In clang's case the DWARF for the operator() contains a
> DW_TAG_formal_parameter for a `this` parameter which is a pointer to
> the anonymous struct type. That anonymous struct then contains its own
> `this` member which is the captured `this`. lldb then contains code in
> GetLambdaValueObject that recognizes this "double this" pattern and
> deals with it appropriately.
>
> In gcc's case the DWARF for the operator() contains a
> DW_TAG_formal_parameter for a `__closure` parameter which is a pointer
> to the anonymous struct type. That anonymous struct contains a
> `__this` member which is the captured `this`. Additionally, gcc emits
> a DW_TAG_variable inside the operator() named `this` with the
> appropriate DW_AT_location that traverses the anonymous struct (as it
> does for all captured variables).
>
> In both cases the compilers emit a DW_AT_object_pointer on the
> operator() pointing to the anonymous struct pointer parameter.
>
> This results in neither debugger being able to understand the output
> of the opposite compiler. gdb cannot understand what clang has emitted
> because it looks at the `this` parameter (which points to the
> anonymous struct) and lldb cannot understand what gcc has emitted
> because it expects the "double this" pattern. This is also annoying
> for third party debuggers (like the one I maintain) because we need to
> recognize and explicitly support both patterns.
>
> I haven't done any research into why the compilers chose to emit what
> they do, but it seems to me[0] like things would be better if clang
> copied gcc's "repeat the captured variables as locals inside
> operator()" behavior (which would make gdb understand clang binaries)
> and then both compilers switched their DW_AT_object_pointers to point
> to the captured `this` if and only if it exists (which would make lldb
> understand gcc binaries), ignoring the anonymous compiler-generated
> struct entirely. Then lambdas that capture `this` would look like
> member functions to debuggers and "just work" without any special
> lambda-aware code.
>
> This would require at least some clarification in the spec since the
> subprogram's DW_AT_object_pointer would point to a local, not a
> parameter, and would point to an object of a different type than the
> DW_TAG_class_type containing the subprogram (or would exist on a
> subprogram not contained in a class at all if compilers elided the
> anonymous struct from DWARF entirely).
>
> Any thoughts?
>

As a clang developer, I have some bias toward the Clang representation here.
Lambdas are classes (per the C++ spec), so it still makes sense to me that
op() is a member function. That said, yeah, giving its "this" pointer another
name seems reasonable, since users can't refer to the closure object by that
name.

Introducing a bunch of locals that expose the right names/types seems OK to
me.
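
To make the shape under discussion concrete, here is a rough sketch: a lambda
that captures `this`, next to a hand-written approximation of the closure
type. The names `Counter`, `IncrementerClosure`, `captured_this` and `self`
are made up for illustration; real compilers name the captured member `this`
(clang) or `__this` (gcc), and neither can be spelled in user code. The local
`self` stands in for the proposed local that DW_AT_object_pointer would
reference.

```cpp
#include <cassert>

struct Counter {
    int count = 0;

    // The lambda captures `this`, so the unqualified `count` in its body
    // means this->count of the enclosing Counter.
    auto make_incrementer() {
        return [this](int by) { count += by; return count; };
    }
};

// Hand-written approximation of the compiler-generated closure type.
// All names here are hypothetical, chosen only for this sketch.
struct IncrementerClosure {
    Counter* captured_this;  // the captured `this` of the enclosing function

    int operator()(int by) const {
        // Inside operator(), `this` is the closure object. A local that
        // re-exposes the captured pointer is what the proposal would have
        // DW_AT_object_pointer refer to.
        Counter* self = captured_this;
        self->count += by;
        return self->count;
    }
};
```

Calling `c.make_incrementer()(2)` and `IncrementerClosure{&c}(2)` on a
`Counter c` should update `c.count` the same way; the difference is only in
which pointer a debugger sees as "this" inside the call operator.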

Using object_pointer to refer to a local "this" could have other uses too -
some languages (I forget which ones) have scope-based object usage, where
"foo f; f.x();" can be written as "using (foo f) { x(); }". Putting
object_pointer on the scope to refer to the local could support that feature.
-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
