Re: [Dwarf-discuss] Representing captured `this` in C++ lambdas

Kyle Huey via Dwarf-discuss Tue, 22 Apr 2025 23:24:27 -0700

On Wed, Apr 16, 2025 at 2:10 PM David Blaikie <dblai...@gmail.com> wrote:
>
>
>
> On Wed, Apr 16, 2025 at 8:59 AM Kyle Huey via Dwarf-discuss 
> <dwarf-discuss@lists.dwarfstd.org> wrote:
>>
>> This may be more of an implementation question than a spec question,
>> but this seems like the place to have the discussion regardless.
>>
>> C++ debuggers that support expression evaluation need to identify the
>> `this` pointer argument of member functions because C++ permits
>> unqualified access to member variables. DWARF 3 introduced the
>> DW_AT_object_pointer attribute for this purpose, and modern versions
>> of clang and gcc emit it. lldb uses this attribute, when present, to
>> find the `this` pointer argument, falling back to checking whether the
>> first formal parameter to the function looks "this-like", while gdb
>> appears to always use a name lookup. Regardless, these end up being
>> interoperable for normal member functions.
>>
>> Lambdas that capture the `this` of a member function present the same
>> need for debuggers, as unqualified accesses within the lambda can also
>> access member variables of `this`. Both clang and gcc desugar lambdas
>> to an anonymous struct type whose members are the captured variables,
>> with an operator() implementation that contains the body of the
>> lambda. They differ, however, in their representation of a captured
>> `this` pointer.
>>
>> In clang's case the DWARF for the operator() contains a
>> DW_TAG_formal_parameter for a `this` parameter which is a pointer to
>> the anonymous struct type. That anonymous struct then contains its own
>> `this` member which is the captured `this`. lldb then contains code in
>> GetLambdaValueObject that recognizes this "double this" pattern and
>> deals with it appropriately.
>>
>> In gcc's case the DWARF for the operator() contains a
>> DW_TAG_formal_parameter for a `__closure` parameter which is a pointer
>> to the anonymous struct type. That anonymous struct contains a
>> `__this` member which is the captured `this`. Additionally, gcc emits
>> a DW_TAG_variable inside the operator() named `this` with the
>> appropriate DW_AT_location that traverses the anonymous struct (as it
>> does for all captured variables).
>>
>> In both cases the compilers emit a DW_AT_object_pointer on the
>> operator() pointing to the anonymous struct pointer parameter.
>>
>> This results in neither debugger being able to understand the output
>> of the opposite compiler. gdb cannot understand what clang has emitted
>> because it looks at the `this` parameter (which points to the
>> anonymous struct) and lldb cannot understand what gcc has emitted
>> because it expects the "double this" pattern. This is also annoying
>> for third party debuggers (like the one I maintain) because we need to
>> recognize and explicitly support both patterns.
>>
>> I haven't done any research into why the compilers chose to emit what
>> they do, but it seems to me[0] like things would be better if clang
>> copied gcc's "repeat the captured variables as locals inside
>> operator()" behavior (which would make gdb understand clang binaries)
>> and then both compilers switched their DW_AT_object_pointers to point
>> to the captured `this` if and only if it exists (which would make lldb
>> understand gcc binaries), ignoring the anonymous compiler-generated
>> struct entirely. Then lambdas that capture `this` would look like
>> member functions to debuggers and "just work" without any special
>> lambda-aware code.
>>
>> This would require at least some clarification in the spec since the
>> subprogram's DW_AT_object_pointer would point to a local, not a
>> parameter, and would point to an object of a different type than the
>> DW_TAG_class_type containing the subprogram (or would exist on a
>> subprogram not contained in a class at all if compilers elided the
>> anonymous struct from DWARF entirely).
>>
>> Any thoughts?
>
>
> As a clang developer, I've some bias for the Clang representation here


Sure.

>  - and lambdas are classes (per the C++ spec), so it still makes sense to me
> that op() is a member function, though, yeah, its "this" pointer could be
> given another name since users can't refer to it by that name.

Do you object to anything I proposed other than removing the
representation of the anonymous class compilers generate for lambdas?
Because that's not really essential to what I want to do if we're fine
with DW_AT_object_pointer pointing to a variable that doesn't share a
containing type.
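
To make the shapes concrete, here is a rough C++ sketch of a lambda that
captures `this` next to a hand-written approximation of the closure type.
The names Widget, Closure, captured_this, self and check_forms_agree are
made up for illustration; the real closure type is unnamed and the
compilers' actual member names are the ones noted in the comments.

```cpp
#include <cassert>

struct Widget {
    int count = 0;

    // What the user writes: the lambda captures `this` and then uses the
    // member `count` without qualification.
    int bump_with_lambda() {
        auto f = [this](int n) { return count += n; };
        return f(1);
    }

    // Hand-written approximation of the closure type (illustrative only).
    struct Closure {
        Widget* captured_this;  // clang names this member `this`, gcc `__this`

        // The implicit object parameter of operator() points at the Closure
        // (clang calls it `this`, gcc `__closure`), not at the Widget.
        int operator()(int n) const {
            // gcc additionally describes a local named `this` whose location
            // traverses the closure; modeled here as an ordinary local.
            Widget* self = captured_this;
            return self->count += n;
        }
    };

    int bump_with_closure() {
        Closure f{this};
        return f(1);
    }
};

// Both forms should behave identically; call this from main() to check.
inline void check_forms_agree() {
    Widget a, b;
    assert(a.bump_with_lambda() == b.bump_with_closure());
    assert(a.count == b.count);
}
```

In this sketch, the variable I would like DW_AT_object_pointer to name
corresponds to `self` (a Widget*), whose type is not the class that
contains operator().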

Now that I've thought about it a bit more, I think that is not
particularly weird. Rust, for example, allows

fn member(self: Rc<Self>, arg1: Foo, ...)

where the receiver can have one of several types that are convertible
to the containing type but are definitely not the containing type
itself. rustc doesn't emit DW_AT_object_pointer today, but it should.
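
From the debugger side, the lookup this implies could be sketched roughly
as below. This is a toy model in C++, not code from gdb or lldb: Type,
Variable, ObjectPointer, Subprogram, resolve and the three builder
functions are all hypothetical stand-ins for DWARF entries, and the
assumption is that a debugger resolves an unqualified name by trying
locals, then parameters, then members reachable through whatever
DW_AT_object_pointer names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy stand-ins for DWARF entries. Every name here is hypothetical.
struct Type {
    std::vector<std::string> members;  // names of the type's data members
};

struct Variable {                   // a formal parameter or a local variable
    std::string name;
    const Type* pointee = nullptr;  // set when it points to a struct/class
};

struct ObjectPointer {              // models the target of DW_AT_object_pointer
    bool is_local;                  // false: index into params, true: locals
    std::size_t index;
};

struct Subprogram {
    std::vector<Variable> params;
    std::vector<Variable> locals;
    std::optional<ObjectPointer> object_pointer;
};

// Resolve an unqualified name: locals, then parameters, then members of
// the object that the object pointer refers to.
std::optional<std::string> resolve(const Subprogram& fn,
                                   const std::string& name) {
    for (const Variable& v : fn.locals)
        if (v.name == name) return "local " + v.name;
    for (const Variable& v : fn.params)
        if (v.name == name) return "parameter " + v.name;
    if (fn.object_pointer) {
        const Variable& obj = fn.object_pointer->is_local
                                  ? fn.locals.at(fn.object_pointer->index)
                                  : fn.params.at(fn.object_pointer->index);
        if (obj.pointee)
            for (const std::string& m : obj.pointee->members)
                if (m == name) return obj.name + "->" + m;
    }
    return std::nullopt;
}

// Ordinary member function: `this` is the first parameter.
Subprogram member_function(const Type& cls) {
    Subprogram fn;
    fn.params.push_back(Variable{"this", &cls});
    fn.object_pointer = ObjectPointer{false, 0};
    return fn;
}

// clang's layout as described in this thread: the parameter `this` points
// at the closure, which itself has a member named `this`.
Subprogram clang_lambda_today(const Type& closure) {
    Subprogram fn;
    fn.params.push_back(Variable{"this", &closure});
    fn.object_pointer = ObjectPointer{false, 0};
    return fn;
}

// Proposed layout: a local `this` of the captured type, named by the
// object pointer instead of the closure parameter.
Subprogram proposed_lambda(const Type& closure, const Type& cls) {
    Subprogram fn;
    fn.params.push_back(Variable{"__closure", &closure});
    fn.locals.push_back(Variable{"this", &cls});
    fn.object_pointer = ObjectPointer{true, 0};
    return fn;
}

// Usage: the proposed lambda resolves `count` the same way a member
// function does, while the generic lookup fails on clang's current layout.
inline void check_examples() {
    Type widget;
    widget.members = {"count"};
    Type closure;
    closure.members = {"this"};
    auto m = resolve(member_function(widget), "count");
    assert(m && *m == "this->count");
    assert(!resolve(clang_lambda_today(closure), "count"));
    auto p = resolve(proposed_lambda(closure, widget), "count");
    assert(p && *p == "this->count");
}
```

The point of the sketch is only that resolve() has no lambda-specific
branch and never consults the containing type of the subprogram.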

- Kyle


> Introducing a bunch of locals that expose the right names/types - seems OK to 
> me.
>
> Using object_pointer to refer to a local "this" could have other uses too - 
> some languages (I forget which ones) have a scope based object usage, like 
> "foo f; f.x();" -> "using (foo f) { x(); }" and so using object_pointer on 
> the scope to refer to the local could be used to support that feature.

- Kyle
-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
