[Dwarf-discuss] Representing captured `this` in C++ lambdas

Kyle Huey via Dwarf-discuss Wed, 16 Apr 2025 08:59:34 -0700

This may be more of an implementation question than a spec question,
but this seems like the place to have the discussion regardless.


C++ debuggers that support expression evaluation need to identify the
`this` pointer argument of member functions because C++ permits
unqualified access to member variables. DWARF 3 introduced the
DW_AT_object_pointer attribute for this purpose, and modern versions
of clang and gcc emit it. lldb uses this attribute, when present, to
find the `this` pointer argument, falling back to checking whether the
first formal parameter to the function looks "this-like", while gdb
appears to always use a name lookup. Regardless, these end up being
interoperable for normal member functions.

Lambdas that capture the `this` of a member function present the same
need for debuggers, as unqualified accesses within the lambda can also
access member variables of `this`. Both clang and gcc desugar lambdas
to an anonymous struct type whose members are the captured variables,
with an operator() implementation that contains the body of the
lambda. They differ, however, in their representation of a captured
`this` pointer.
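
As a rough illustration of that desugaring (a hand-written approximation,
not either compiler's actual output), the sketch below puts a lambda that
captures `this` next to an explicit closure struct. The names `Counter`,
`HandWrittenClosure`, and `captured_this` are made up for the example; the
real closure type is unnamed, and the name of its captured-`this` member is
exactly where clang and gcc diverge.

```cpp
struct Counter {
    int count = 7;

    // Inside the lambda body, the unqualified `count` means `this->count`,
    // where `this` is the captured pointer to the enclosing Counter.
    int read_via_lambda() {
        auto f = [this] { return count; };
        return f();
    }
};

// Hypothetical stand-in for the compiler-generated anonymous struct:
// one member per capture, plus an operator() holding the lambda body.
struct HandWrittenClosure {
    Counter* captured_this;  // illustrative name for the captured `this`

    // The implicit object here is the closure, not the Counter, so the
    // member access has to go through the captured pointer.
    int operator()() const { return captured_this->count; }
};

int read_via_closure(Counter& c) {
    HandWrittenClosure f{&c};
    return f();
}
```

The point of the comparison is that operator() has two candidate "this"
objects: the closure itself and the captured enclosing object. A debugger
evaluating `count` needs the second one.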

In clang's case the DWARF for the operator() contains a
DW_TAG_formal_parameter for a `this` parameter which is a pointer to
the anonymous struct type. That anonymous struct then contains its own
`this` member which is the captured `this`. lldb then contains code in
GetLambdaValueObject that recognizes this "double this" pattern and
deals with it appropriately.

In gcc's case the DWARF for the operator() contains a
DW_TAG_formal_parameter for a `__closure` parameter which is a pointer
to the anonymous struct type. That anonymous struct contains a
`__this` member which is the captured `this`. Additionally, gcc emits
a DW_TAG_variable inside the operator() named `this` with the
appropriate DW_AT_location that traverses the anonymous struct (as it
does for all captured variables).

In both cases the compilers emit a DW_AT_object_pointer on the
operator() pointing to the anonymous struct pointer parameter.

This results in neither debugger being able to understand the output
of the opposite compiler. gdb cannot understand what clang has emitted
because it looks at the `this` parameter (which points to the
anonymous struct) and lldb cannot understand what gcc has emitted
because it expects the "double this" pattern. This is also annoying
for third-party debuggers (like the one I maintain) because we need to
recognize and explicitly support both patterns.
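
To make "support both patterns" concrete, here is a minimal sketch of the
kind of check a debugger might perform on the children of a lambda's
operator(). It is not lldb's, gdb's, or any real debugger's code; `Child`,
`CapturedThis`, and `classify` are hypothetical, and real DWARF entries are
reduced to a tag, a name, and the member names of the pointed-to struct.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, heavily simplified model of a child DIE of operator().
enum class Tag { FormalParameter, Variable };

struct Child {
    Tag tag;
    std::string name;
    // Member names of the struct this entry points to (empty if unknown).
    std::vector<std::string> pointee_members;
};

enum class CapturedThis { NotFound, GccLocalVariable, ClangDoubleThis };

bool has_member(const Child& c, const std::string& member) {
    return std::find(c.pointee_members.begin(), c.pointee_members.end(),
                     member) != c.pointee_members.end();
}

CapturedThis classify(const std::vector<Child>& children) {
    // gcc pattern: a DW_TAG_variable named `this` inside operator().
    for (const Child& c : children) {
        if (c.tag == Tag::Variable && c.name == "this")
            return CapturedThis::GccLocalVariable;
    }
    // clang pattern: a `this` parameter pointing to a struct that itself
    // has a member named `this` (the "double this" pattern).
    for (const Child& c : children) {
        if (c.tag == Tag::FormalParameter && c.name == "this" &&
            has_member(c, "this"))
            return CapturedThis::ClangDoubleThis;
    }
    // Ordinary member function, or a lambda that does not capture `this`.
    return CapturedThis::NotFound;
}
```

Both branches are heuristics keyed on names, which is the maintenance
burden described above: each compiler's convention has to be special-cased.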

I haven't done any research into why the compilers chose to emit what
they do, but it seems to me[0] like things would be better if clang
copied gcc's "repeat the captured variables as locals inside
operator()" behavior (which would make gdb understand clang binaries)
and then both compilers switched their DW_AT_object_pointers to point
to the captured `this` if and only if it exists (which would make lldb
understand gcc binaries), ignoring the anonymous compiler-generated
struct entirely. Then lambdas that capture `this` would look like
member functions to debuggers and "just work" without any special
lambda-aware code.

This would require at least some clarification in the spec since the
subprogram's DW_AT_object_pointer would point to a local, not a
parameter, and would point to an object of a different type than the
DW_TAG_class_type containing the subprogram (or would exist on a
subprogram not contained in a class at all if compilers elided the
anonymous struct from DWARF entirely).
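
For comparison, a sketch of the lookup under the proposed scheme, again
with hypothetical types (`TypeInfo`, `Entry`, `Subprogram`) standing in
for real DWARF entries. The assumption is that DW_AT_object_pointer may
reference either a parameter or a local, and that the debugger follows it
without caring which.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical simplified type: a name plus its member names.
struct TypeInfo {
    std::string name;
    std::vector<std::string> members;
};

// Hypothetical child of a subprogram: a parameter or a local variable.
struct Entry {
    bool is_parameter;        // true: formal parameter, false: local variable
    std::string name;
    const TypeInfo* pointee;  // type this entry points to
};

struct Subprogram {
    std::vector<Entry> children;
    // Models DW_AT_object_pointer as an index into `children`.
    std::optional<std::size_t> object_pointer;
};

// Does an unqualified `name` resolve to a member of the object that the
// subprogram's object pointer refers to? No lambda-specific logic needed.
bool resolves_as_member(const Subprogram& sp, const std::string& name) {
    if (!sp.object_pointer || *sp.object_pointer >= sp.children.size())
        return false;
    const TypeInfo* type = sp.children[*sp.object_pointer].pointee;
    if (type == nullptr)
        return false;
    return std::find(type->members.begin(), type->members.end(), name) !=
           type->members.end();
}
```

With this shape, an ordinary member function (object pointer referencing
the `this` parameter) and a lambda's operator() (object pointer referencing
a local `this`) go through the same code path.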

Any thoughts?

- Kyle

[0] At the risk of https://xkcd.com/927/
