Re: [Dwarf-discuss] Representing captured `this` in C++ lambdas

2025-04-23 Thread Kyle Huey via Dwarf-discuss
On Wed, Apr 23, 2025 at 3:14 PM David Blaikie wrote:
>
>
>
> On Wed, Apr 23, 2025 at 2:53 PM Kyle Huey wrote:
>>
>> On Wed, Apr 23, 2025 at 2:20 PM David Blaikie wrote:
>> >>
>> >> Do you object to anything I proposed other than removing the
>> >> representation of the anonymous class compilers generate for lambdas?
>> >
>> >
>> > I'm not a /super/ fan of introducing a bunch of locals in addition to the 
>> > member descriptions - it'll be a bunch of extra DWARF that'd be nice to 
>> > avoid if we can...
>>
>> Yeah, that would be the reason to get rid of the representation of the
>> anonymous struct. Then we're just converting members into locals
>> rather than duplicating anything. Even if you really want to keep the
>> class itself, the members could be dropped.
>
>
> Except it's likely users will want to inspect the state of a lambda in some 
> situations. They get passed around, stored (in std::functions or similar 
> type-erased things, often), etc in many cases & may be important to know what 
> they represent when not near/in a call to the lambda. So having the members 
> described seems important.

Mmm, yes, ok.

- Kyle

>>
>> > But putting object_pointer on the class member that stores "this" seems 
>> > problematic since that's effectively at the same scope as the real object 
>> > pointer - it'd be awkward to say there's two "this" at the same scope and 
>> > have to say that the member variable "this" shadows the real "this" in 
>> > some way.
>> >
>> > And then you want the captured variables to be in a scope that is inside 
>> > the "this" scope so they override unqualified lookup for any names that 
>> > are also members of "this"...
>> >
>> > So, yeah, I get why you/gcc developers arrived where they did. I wouldn't 
>> > mind some size analysis to see how bad the regression/cost is, that might 
>> > help inform whether it's worth trying to address it.
>>
>> Hmm. I could look at hacking something up to measure but it's not at
>> the top of my priority list.
>
>
> Fair.
-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss


Re: [Dwarf-discuss] Representing captured `this` in C++ lambdas

2025-04-23 Thread David Blaikie via Dwarf-discuss
>
> Do you object to anything I proposed other than removing the
> representation of the anonymous class compilers generate for lambdas?
>

I'm not a /super/ fan of introducing a bunch of locals in addition to the
member descriptions - it'll be a bunch of extra DWARF that'd be nice to
avoid if we can...

But putting object_pointer on the class member that stores "this" seems
problematic since that's effectively at the same scope as the real object
pointer - it'd be awkward to say there's two "this" at the same scope and
have to say that the member variable "this" shadows the real "this" in some
way.

And then you want the captured variables to be in a scope that is inside
the "this" scope so they override unqualified lookup for any names that are
also members of "this"...
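
A small C++ illustration of the lookup rule in question, as a sketch only
(the Widget type and its members are made up for this example): inside the
lambda body, an unqualified name that is both a capture and a member of the
captured "this" resolves to the capture, and the member is only reachable
through "this->".

```cpp
// Hypothetical type, used only to show the name lookup order.
struct Widget {
    int value = 1;  // member of the enclosing class

    int run() {
        int value = 2;  // local that hides the member inside run()
        // Captures both the enclosing object and a copy of the local.
        auto f = [this, value] {
            // Unqualified `value` is the captured local (2);
            // the member (1) needs an explicit `this->`.
            return value * 10 + this->value;
        };
        return f();
    }
};
```

A debugger evaluating "value" while stopped inside the lambda body would
need the same ordering, which is what nesting the captures inside the scope
of the captured "this" is meant to express.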

So, yeah, I get why you/gcc developers arrived where they did. I wouldn't
mind some size analysis to see how bad the regression/cost is, that might
help inform whether it's worth trying to address it.


> Because that's not really essential to what I want to do if we're fine
> with DW_AT_object_pointer pointing to a variable that doesn't share a
> containing type.
>

Yeah, this general idea I think I'm down with - if it generalizes to
scope-based "this" in other languages/features, etc, that seems good.


>
> Now that I've thought about it a bit more I think that is not
> particularly weird. Rust for example allows
>
> fn member(self: Rc<Self>, arg1: Foo, ...)
>
> where the type of the receiver can be certain things that are
> convertible to the containing object but are definitely not the
> containing object. rustc doesn't emit DW_AT_object_pointer today but
> it should.
>
> - Kyle
>
>
> > Introducing a bunch of locals that expose the right names/types - seems
> > OK to me.
> >
> > Using object_pointer to refer to a local "this" could have other uses
> > too - some languages (I forget which ones) have a scope based object usage,
> > like "foo f; f.x();" -> "using (foo f) { x(); }" and so using
> > object_pointer on the scope to refer to the local could be used to support
> > that feature.
>
> - Kyle
>


Re: [Dwarf-discuss] Representing captured `this` in C++ lambdas

2025-04-23 Thread Kyle Huey via Dwarf-discuss
On Wed, Apr 23, 2025 at 2:20 PM David Blaikie wrote:
>>
>> Do you object to anything I proposed other than removing the
>> representation of the anonymous class compilers generate for lambdas?
>
>
> I'm not a /super/ fan of introducing a bunch of locals in addition to the 
> member descriptions - it'll be a bunch of extra DWARF that'd be nice to avoid 
> if we can...

Yeah, that would be the reason to get rid of the representation of the
anonymous struct. Then we're just converting members into locals
rather than duplicating anything. Even if you really want to keep the
class itself, the members could be dropped.

> But putting object_pointer on the class member that stores "this" seems 
> problematic since that's effectively at the same scope as the real object 
> pointer - it'd be awkward to say there's two "this" at the same scope and 
> have to say that the member variable "this" shadows the real "this" in some 
> way.
>
> And then you want the captured variables to be in a scope that is inside the 
> "this" scope so they override unqualified lookup for any names that are also 
> members of "this"...
>
> So, yeah, I get why you/gcc developers arrived where they did. I wouldn't 
> mind some size analysis to see how bad the regression/cost is, that might 
> help inform whether it's worth trying to address it.

Hmm. I could look at hacking something up to measure but it's not at
the top of my priority list.

- Kyle

>>
>> Because that's not really essential to what I want to do if we're fine
>> with DW_AT_object_pointer pointing to a variable that doesn't share a
>> containing type.
>
>
> Yeah, this general idea I think I'm down with - if it generalizes to 
> scope-based "this" in other languages/features, etc, that seems good.
>
>>
>>
>> Now that I've thought about it a bit more I think that is not
>> particularly weird. Rust for example allows
>>
>> fn member(self: Rc<Self>, arg1: Foo, ...)
>>
>> where the type of the receiver can be certain things that are
>> convertible to the containing object but are definitely not the
>> containing object. rustc doesn't emit DW_AT_object_pointer today but
>> it should.
>>
>> - Kyle
>>
>>
>> > Introducing a bunch of locals that expose the right names/types - seems OK 
>> > to me.
>> >
>> > Using object_pointer to refer to a local "this" could have other uses too 
>> > - some languages (I forget which ones) have a scope based object usage, 
>> > like "foo f; f.x();" -> "using (foo f) { x(); }" and so using 
>> > object_pointer on the scope to refer to the local could be used to support 
>> > that feature.
>>
>> - Kyle


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-04-23 Thread Kyle Huey via Dwarf-discuss
On Wed, Apr 23, 2025 at 8:45 AM Michael Buch wrote:
>
> Sounds like this is what
> https://github.com/llvm/llvm-project/pull/130255 is trying to achieve?

Yes, though that may be trying to achieve other things too (there's
some discussion of trying to go from the class definition in the DWARF
to the vtable pointer which I haven't considered and don't immediately
see a use case for).

- Kyle

> If we could simplify that part of LLDB that'd be great!
>
> On Wed, 23 Apr 2025 at 16:32, Kyle Huey via Dwarf-discuss wrote:
> >
> > Consider the following C++ program
> >
> > #include <stdio.h>
> >
> > class Base {
> > public:
> >   virtual const char* method1() = 0;
> >   void method2() {
> >     printf("%s\n", method1());
> >   }
> > };
> >
> > class DerivedOne : public Base {
> >   virtual const char* method1() override {
> >     return "DerivedOne";
> >   }
> > };
> >
> > template<typename T>
> > class DerivedTwo : public Base {
> > public:
> >   DerivedTwo(T t) : t(t) {}
> > private:
> >   virtual const char* method1() override {
> >     return t;
> >   }
> >   T t;
> > };
> >
> > template<typename T>
> > class DerivedThree : public Base {
> > public:
> >   DerivedThree(T t) : t(t) {}
> > private:
> >   virtual const char* method1() override {
> >     return t();
> >   }
> >   T t;
> > };
> >
> > int main() {
> >   DerivedOne d1;
> >   DerivedTwo d2("DerivedTwo");
> >   DerivedThree d3([]() {
> >     return "DerivedThree";
> >   });
> >   d1.method2();
> >   d2.method2();
> >   d3.method2();
> >   return 0;
> > }
> >
> > If a debugger stops at method1, the DW_TAG_formal_parameter will tell
> > the debugger the type of `this` is Base. Downcasting to the derived
> > type is very useful for the programmer though, so both gdb and lldb
> > contain a feature to downcast based on the vtable pointer (the "print
> > object" and the "target.prefer-dynamic" settings in the respective
> > debuggers).
> >
> > The first part of this is straightforward. The DWARF for Base will
> > contain a member for the vtable pointer, and that plus knowledge of
> > how the ABI lays out vtables allows the debugger to effectively do a
> > dynamic_cast to obtain a pointer to the most derived object.
> > From there the vtable address is compared against the ELF symbol table
> > to find the mangled name of the vtable symbol.
> >
> > Then things begin to get hairy, the debugger demangles the mangled
> > name that exists in the ELF symbol table, chops off the "vtable for "
> > prefix on the demangled name, and searches for the type by name in the
> > DWARF. If it finds the type, it adjusts the type of the value and
> > prints it accordingly. But this text based matching doesn't always
> > work. There are no mangled names for types so the debugger's
> > demangling has to match the compiler's output character for character.
> >
> > In the example program I've provided, when using the respective
> > compilers, gdb can successfully downcast DerivedOne and DerivedThree
> > but not DerivedTwo. gdb fails because gcc emits the DW_TAG_class_type
> > with a DW_AT_name "DerivedTwo >" but libiberty
> > demangles the vtable symbol to "vtable for
> > DerivedTwo" and those do not match. lldb can only
> > successfully downcast DerivedOne. lldb appears to not handle classes
> > with template parameters correctly at all. And even if all of that
> > were fixed, libiberty and llvm disagree about how to demangle the
> > symbol for DerivedTwo's vtable, so the two ecosystems would not be
> > interoperable.
> >
> > Perhaps these are merely quality of implementation issues and belong
> > on the respective bug trackers, however, better representations are
> > possible. Rustc, for example, does not rely on the ELF symbol table
> > and demangled string matching. It emits a global variable in the DWARF
> > whose location is the address of the vtable. That variable has a
> > DW_AT_type pointing to a DW_TAG_class_type that describes the layout
> > of the vtable, and that type has a DW_AT_containing_type that points
> > to the type making use of that vtable.
> >
> > Any thoughts?
> >
> > - Kyle


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-04-23 Thread Michael Buch via Dwarf-discuss
Sounds like this is what
https://github.com/llvm/llvm-project/pull/130255 is trying to achieve?
If we could simplify that part of LLDB that'd be great!

On Wed, 23 Apr 2025 at 16:32, Kyle Huey via Dwarf-discuss wrote:
>
> Consider the following C++ program
>
> #include <stdio.h>
>
> class Base {
> public:
>   virtual const char* method1() = 0;
>   void method2() {
>     printf("%s\n", method1());
>   }
> };
>
> class DerivedOne : public Base {
>   virtual const char* method1() override {
>     return "DerivedOne";
>   }
> };
>
> template<typename T>
> class DerivedTwo : public Base {
> public:
>   DerivedTwo(T t) : t(t) {}
> private:
>   virtual const char* method1() override {
>     return t;
>   }
>   T t;
> };
>
> template<typename T>
> class DerivedThree : public Base {
> public:
>   DerivedThree(T t) : t(t) {}
> private:
>   virtual const char* method1() override {
>     return t();
>   }
>   T t;
> };
>
> int main() {
>   DerivedOne d1;
>   DerivedTwo d2("DerivedTwo");
>   DerivedThree d3([]() {
>     return "DerivedThree";
>   });
>   d1.method2();
>   d2.method2();
>   d3.method2();
>   return 0;
> }
>
> If a debugger stops at method1, the DW_TAG_formal_parameter will tell
> the debugger the type of `this` is Base. Downcasting to the derived
> type is very useful for the programmer though, so both gdb and lldb
> contain a feature to downcast based on the vtable pointer (the "print
> object" and the "target.prefer-dynamic" settings in the respective
> debuggers).
>
> The first part of this is straightforward. The DWARF for Base will
> contain a member for the vtable pointer, and that plus knowledge of
> how the ABI lays out vtables allows the debugger to effectively do a
> dynamic_cast to obtain a pointer to the most derived object.
> From there the vtable address is compared against the ELF symbol table
> to find the mangled name of the vtable symbol.
>
> Then things begin to get hairy, the debugger demangles the mangled
> name that exists in the ELF symbol table, chops off the "vtable for "
> prefix on the demangled name, and searches for the type by name in the
> DWARF. If it finds the type, it adjusts the type of the value and
> prints it accordingly. But this text based matching doesn't always
> work. There are no mangled names for types so the debugger's
> demangling has to match the compiler's output character for character.
>
> In the example program I've provided, when using the respective
> compilers, gdb can successfully downcast DerivedOne and DerivedThree
> but not DerivedTwo. gdb fails because gcc emits the DW_TAG_class_type
> with a DW_AT_name "DerivedTwo >" but libiberty
> demangles the vtable symbol to "vtable for
> DerivedTwo" and those do not match. lldb can only
> successfully downcast DerivedOne. lldb appears to not handle classes
> with template parameters correctly at all. And even if all of that
> were fixed, libiberty and llvm disagree about how to demangle the
> symbol for DerivedTwo's vtable, so the two ecosystems would not be
> interoperable.
>
> Perhaps these are merely quality of implementation issues and belong
> on the respective bug trackers, however, better representations are
> possible. Rustc, for example, does not rely on the ELF symbol table
> and demangled string matching. It emits a global variable in the DWARF
> whose location is the address of the vtable. That variable has a
> DW_AT_type pointing to a DW_TAG_class_type that describes the layout
> of the vtable, and that type has a DW_AT_containing_type that points
> to the type making use of that vtable.
>
> Any thoughts?
>
> - Kyle


[Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-04-23 Thread Kyle Huey via Dwarf-discuss
Consider the following C++ program

#include <stdio.h>

class Base {
public:
  virtual const char* method1() = 0;
  void method2() {
    printf("%s\n", method1());
  }
};

class DerivedOne : public Base {
  virtual const char* method1() override {
    return "DerivedOne";
  }
};

template<typename T>
class DerivedTwo : public Base {
public:
  DerivedTwo(T t) : t(t) {}
private:
  virtual const char* method1() override {
    return t;
  }
  T t;
};

template<typename T>
class DerivedThree : public Base {
public:
  DerivedThree(T t) : t(t) {}
private:
  virtual const char* method1() override {
    return t();
  }
  T t;
};

int main() {
  DerivedOne d1;
  DerivedTwo d2("DerivedTwo");
  DerivedThree d3([]() {
    return "DerivedThree";
  });
  d1.method2();
  d2.method2();
  d3.method2();
  return 0;
}

If a debugger stops at method1, the DW_TAG_formal_parameter will tell
the debugger the type of `this` is Base. Downcasting to the derived
type is very useful for the programmer though, so both gdb and lldb
contain a feature to downcast based on the vtable pointer (the "print
object" and the "target.prefer-dynamic-value" settings in the respective
debuggers).

The first part of this is straightforward. The DWARF for Base will
contain a member for the vtable pointer, and that plus knowledge of
how the ABI lays out vtables allows the debugger to effectively do a
dynamic_cast to obtain a pointer to the most derived object.
From there the vtable address is compared against the ELF symbol table
to find the mangled name of the vtable symbol.
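
As a rough illustration of that first step, here is a minimal sketch that
assumes the Itanium C++ ABI, where the word two slots before the address
the vtable pointer refers to holds the "offset to top". The types A, B, C
and the helper most_derived are made up for this example, and a real
debugger would read these words from the inferior's memory rather than
from its own address space.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical hierarchy, used only for this illustration.
struct A { virtual ~A() = default; long a = 1; };
struct B { virtual ~B() = default; long b = 2; };
struct C : A, B { long c = 3; };

// Assumption: Itanium C++ ABI. The vtable pointer is the first word of a
// polymorphic (sub)object, and vptr[-2] is the offset from that subobject
// to the start of the most derived object (vptr[-1] is the RTTI pointer).
const void* most_derived(const void* subobject) {
    const std::ptrdiff_t* vptr = nullptr;
    std::memcpy(&vptr, subobject, sizeof vptr);  // read the vtable pointer
    const std::ptrdiff_t offset_to_top = vptr[-2];
    return static_cast<const char*>(subobject) + offset_to_top;
}
```

Given "C c; B* b = &c;", most_derived(b) should agree with
dynamic_cast<const void*>(b) on such an ABI. Note that this only recovers
the address; it says nothing yet about which type lives there.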

Then things begin to get hairy: the debugger demangles the mangled
name that exists in the ELF symbol table, chops off the "vtable for "
prefix on the demangled name, and searches for the type by name in the
DWARF. If it finds the type, it adjusts the type of the value and
prints it accordingly. But this text-based matching doesn't always
work. There are no mangled names for types in the DWARF, so the
debugger's demangling has to match the compiler's output character for
character.
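
For concreteness, the text-based step looks roughly like the sketch below.
It uses abi::__cxa_demangle from <cxxabi.h> (available with GCC and Clang
runtimes); the helper name type_name_from_vtable_symbol is made up, and
this is not how gdb or lldb are actually structured.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Hypothetical helper: demangle an ELF symbol and, if it names a vtable,
// return the type name that would then be searched for in the DWARF.
// Returns an empty string when the symbol is not a vtable symbol.
std::string type_name_from_vtable_symbol(const char* mangled) {
    int status = 0;
    char* demangled = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr) {
        return std::string();
    }
    std::string text(demangled);
    std::free(demangled);  // __cxa_demangle allocates with malloc

    const std::string prefix = "vtable for ";
    if (text.compare(0, prefix.size(), prefix) != 0) {
        return std::string();
    }
    return text.substr(prefix.size());
}
```

For a non-template class such as DerivedOne (symbol _ZTV10DerivedOne) the
result is simply "DerivedOne". For templates, the exact spelling of the
arguments depends on the demangler, which is where the string comparison
against DW_AT_name becomes fragile.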

In the example program I've provided, when using the respective
compilers, gdb can successfully downcast DerivedOne and DerivedThree
but not DerivedTwo. gdb fails because gcc emits the DW_TAG_class_type
with a DW_AT_name "DerivedTwo >" but libiberty
demangles the vtable symbol to "vtable for
DerivedTwo" and those do not match. lldb can only
successfully downcast DerivedOne. lldb appears to not handle classes
with template parameters correctly at all. And even if all of that
were fixed, libiberty and llvm disagree about how to demangle the
symbol for DerivedTwo's vtable, so the two ecosystems would not be
interoperable.

Perhaps these are merely quality-of-implementation issues and belong
on the respective bug trackers. However, better representations are
possible. Rustc, for example, does not rely on the ELF symbol table
and demangled string matching. It emits a global variable in the DWARF
whose location is the address of the vtable. That variable has a
DW_AT_type pointing to a DW_TAG_class_type that describes the layout
of the vtable, and that type has a DW_AT_containing_type that points
to the type making use of that vtable.
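
To sketch what a consumer could do with that representation: the debugger
could build an address-keyed index while reading the DWARF and skip the
symbol table and demangler entirely. Everything below (VtableVariable,
VtableIndex, the field names) is hypothetical and heavily simplified.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical summary of one vtable variable DIE after the debugger has
// followed DW_AT_type and then DW_AT_containing_type.
struct VtableVariable {
    std::uint64_t address;        // from the variable's location
    std::string containing_type;  // the type that uses this vtable
};

// Hypothetical index from vtable address to the owning type.
class VtableIndex {
public:
    void add(const VtableVariable& v) {
        by_address_[v.address] = v.containing_type;
    }

    // Returns the dynamic type for a vtable address, or an empty string
    // if no vtable variable was recorded at that address.
    std::string dynamic_type(std::uint64_t vtable_address) const {
        auto it = by_address_.find(vtable_address);
        return it == by_address_.end() ? std::string() : it->second;
    }

private:
    std::map<std::uint64_t, std::string> by_address_;
};
```

This assumes the pointer stored in the object equals the address of the
variable. In the Itanium C++ ABI the vtable pointer refers to a slot past
the start of the vtable symbol, so a C++ consumer would need a range
lookup or a fixed adjustment instead of an exact match.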

Any thoughts?

- Kyle


Re: [Dwarf-discuss] PROPOSAL DW_FORM_implicit_const

2025-04-23 Thread Cary Coutant via Dwarf-discuss
Thanks. I've added this as Issue 250422.1.

https://dwarfstd.org/issues/250422.1.html

> Would we want to delete the following paragraph? Quite possibly yes,
> there is little reason to suggest repetition is a good idea.
>
>  >If the actual attribute form is itself `DW_FORM_indirect`,
>  >the indirection repeats.  There may be one or more
>  >occurrences of `DW_FORM_indirect` in sequence until a
>  >`non-DW_FORM_indirect` form is reached. The sequence of
>  >`DW_FORM_indirect` forms does not have any effect other than
>  >to use up space.

I support removing that paragraph. There is no good reason to support a
chain of indirects, and I think it's just asking for trouble.
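
To make the consumer-side difference concrete, here is a minimal sketch of
resolving the actual form of an attribute whose abbreviation says
DW_FORM_indirect. The form codes (DW_FORM_data1 = 0x0b, DW_FORM_indirect =
0x16) are the standard values; the function names and the allow_chain
switch are made up, and treating a chain as an error is only one possible
reading of the text with the paragraph removed.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr std::uint64_t DW_FORM_data1 = 0x0b;
constexpr std::uint64_t DW_FORM_indirect = 0x16;

// Decode one unsigned LEB128 value starting at data[pos], advancing pos.
std::uint64_t read_uleb128(const std::vector<std::uint8_t>& data,
                           std::size_t& pos) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    std::uint8_t byte = 0;
    do {
        byte = data.at(pos++);
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    return value;
}

// The abbreviation said DW_FORM_indirect, so the attribute value starts
// with a ULEB128 naming the actual form. With allow_chain set, repeated
// DW_FORM_indirect is followed (the paragraph quoted above); without it,
// a chain is rejected.
std::uint64_t resolve_indirect_form(const std::vector<std::uint8_t>& data,
                                    std::size_t& pos, bool allow_chain) {
    std::uint64_t form = read_uleb128(data, pos);
    while (form == DW_FORM_indirect) {
        if (!allow_chain) {
            throw std::runtime_error("chained DW_FORM_indirect");
        }
        form = read_uleb128(data, pos);
    }
    return form;
}
```

With the bytes 16 0b 2a, the permissive mode skips the extra indirection
and lands on DW_FORM_data1, while the strict mode throws. With 0b 2a both
modes behave the same.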

-cary


On Wed, Apr 23, 2025 at 1:23 PM David Anderson wrote:

> On 4/23/25 11:25, Cary Coutant wrote:
> > David,
> >
> > As part of this, we rearrange
> > the references from
> > `DW_FORM_implicit_const`, `DW_FORM_addrx`, and `DW_FORM_indirect`
> > to be listed in the order
> > `DW_FORM_addrx`, `DW_FORM_implicit_const`, and `DW_FORM_indirect`.
> >
> >
> > `DW_FORM_addrx` is not part of the proposal so
> > we keep it separate (just preceding)
> > `DW_FORM_implicit_const` and `DW_FORM_indirect`.
> >
> >
> > Did you mean DW_FORM_addrx_offset where you wrote DW_FORM_addrx here?
> >
> > -cary
>
> Oops. Yes. DW_FORM_addrx_offset.  New in DWARF6.
> DavidA
>
> --
> Space is to place as eternity is to time.
> -- Joseph Joubert
>


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-04-23 Thread Cary Coutant via Dwarf-discuss
>
> The first part of this is straightforward. The DWARF for Base will
> contain a member for the vtable pointer, and that plus knowledge of
> how the ABI lays out vtables allows the debugger to effectively do a
> dynamic_cast to obtain a pointer to the most derived object.
> From there the vtable address is compared against the ELF symbol table
> to find the mangled name of the vtable symbol.
>

This made me do a bit of research...

The artificial member for the vtable pointer appears to be a DWARF
extension requested as far back as 2003 and implemented in 2009:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11208

But I can't find any relevant discussion on the DWARF mailing lists, until
a question arose about that very member in 2022:

   https://dwarfstd.org/pipermail/dwarf-discuss/2022-February/002127.html

It seems to me that, given the apparent need for this information in the
DWARF info, we should have addressed it in DWARF by now. I suspect the
DWARF committee's position was (or would have been) that the ABI tells you
how to find the vtable so it doesn't need to be explicitly recorded in the
DWARF info. But if both GCC and LLVM have decided it's useful enough (and
there's discussion about that point in the original PR that 11208 spun off
from), then we should discuss it. Otherwise, we risk having different
toolchains adopt different solutions. (GCC and LLVM appear to have avoided
that through careful consideration of what the other project was doing.)
The argument in PR 11208 is that it's /legal/ in DWARF to do this, so no
new DWARF feature was requested.

The request in PR 11208 was for three things:

> 1) I'd like to be able to locate the vtable pointer in the class
>    structure so that the debugger knows that the hole in the apparent
>    layout is not padding.
>
> 2) I'd like to know the type of the target of the vtable pointer, so
>    that if the user asks to see it they see something sane.
>
> 3) I'd like to be able to find a specific virtual functions entry in
>    the vtable, however I believe that this information will be best
>    expressed as a property of the function, not directly of the class
>    or vtable. DWARF3 has the DW_AT_vtable_elem_location attribute for
>    precisely this information. gcc should generate that too.
>
>    Quoting the DWARF spec again :-
>      An entry for a virtual function also has a
>      DW_AT_vtable_elem_location attribute whose value contains a
>      location description yielding the address of the slot for the
>      function within the virtual function table for the enclosing
>      class. The address of an object of the enclosing type is pushed
>      onto the expression stack before the location description is
>      evaluated.

Point #1 is satisfied with an artificial member whose data_member_location
is the offset of the vtable pointer.

I'm not clear how Point #2 was addressed.

Point #3 was addressed via the vtable_elem_location attribute.

Looking at the DWARF generated by GCC (and I'm guessing LLVM does the
same), I see vtable_elem_location attributes that look like this:

<1b8>   DW_AT_vtable_elem_location: 2 byte block: 10 0 (DW_OP_constu: 0)

This is not correct DWARF! It's supposed to be a location description, and
this is merely a DWARF expression that evaluates to an offset relative to
the vtable pointer. The description of the attribute says that the address
of an object of the enclosing type is pushed onto the expression stack, so
there really ought to be a DW_OP_deref to get the vtable pointer on the
stack, followed by the DW_OP_constu and DW_OP_plus.
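
To spell out the difference, here is a toy evaluator covering just the
three operations involved. The opcode values (DW_OP_deref = 0x06,
DW_OP_constu = 0x10, DW_OP_plus = 0x22, the standard name for the
addition) are from the DWARF opcode table; the Memory map, the evaluate
function, and the addresses used below are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <vector>

constexpr std::uint8_t DW_OP_deref = 0x06;
constexpr std::uint8_t DW_OP_constu = 0x10;
constexpr std::uint8_t DW_OP_plus = 0x22;

// Hypothetical stand-in for target memory: address -> pointer-sized word.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// Evaluate an expression with the object's address pushed first, as the
// description of DW_AT_vtable_elem_location requires.
std::uint64_t evaluate(const std::vector<std::uint8_t>& expr,
                       std::uint64_t object_address, const Memory& memory) {
    std::vector<std::uint64_t> stack{object_address};
    for (std::size_t i = 0; i < expr.size();) {
        const std::uint8_t op = expr[i++];
        if (op == DW_OP_deref) {
            stack.back() = memory.at(stack.back());
        } else if (op == DW_OP_constu) {
            std::uint64_t value = 0;  // ULEB128 operand
            unsigned shift = 0;
            std::uint8_t byte = 0;
            do {
                byte = expr.at(i++);
                value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
                shift += 7;
            } while (byte & 0x80);
            stack.push_back(value);
        } else if (op == DW_OP_plus) {
            const std::uint64_t rhs = stack.back();
            stack.pop_back();
            stack.back() += rhs;
        } else {
            throw std::runtime_error("unsupported opcode in this sketch");
        }
    }
    return stack.back();
}
```

Take an object at 0x1000 whose first word holds the vtable pointer 0x5000.
The two-byte block 10 00 leaves 0 on top of the stack, which is an offset
and not an address. The sequence 06 10 08 22 (deref, constu 8, plus)
yields 0x5008, the address of the slot at offset 8 in the vtable.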

Now if we compare this to DW_AT_data_member_location, we see that one valid
form for that attribute is an integer constant providing the offset of the
data member. But even there, if the attribute has a location expression, it
should compute an actual address, not just deliver the offset.

It would seem an obvious and useful extension to DWARF to allow
DW_AT_vtable_elem_location to take a constant class form that provides the
offset relative to the start of the vtable, so an acceptable form of the
attribute might be:

<1b8>   DW_AT_vtable_elem_location: 0   # (using a constant class form)

There's still the question of what do we do about the form GCC is already
emitting (and has been emitting since 2009)? Make it legal? Let it slide
and ask the compilers to fix it?

Getting back to the original request, do we need an issue to provide an
officially-blessed DWARF way to find the vtable pointer? The current
approach seems rather hacky, especially with respect to attaching a special
meaning to the DW_AT_name attribute. Perhaps a DW_AT_vtable_ptr_location
attribute on the class? And maybe a DW_OP_push_vtable_location operator?


> Then things begin to get hairy, the debugger demangles the mangled
> name that exists in the ELF symbol table, chops off the "vtable for "
> prefix on the demangled name, and searches for the type by name in the
> DWARF. If it finds the type, it adjust

Re: [Dwarf-discuss] PROPOSAL DW_FORM_implicit_const

2025-04-23 Thread Cary Coutant via Dwarf-discuss
David,

> As part of this, we rearrange
> the references from
> `DW_FORM_implicit_const`, `DW_FORM_addrx`, and `DW_FORM_indirect`
> to be listed in the order
> `DW_FORM_addrx`, `DW_FORM_implicit_const`, and `DW_FORM_indirect`.


> `DW_FORM_addrx` is not part of the proposal so
> we keep it separate (just preceding)
> `DW_FORM_implicit_const` and `DW_FORM_indirect`.


Did you mean DW_FORM_addrx_offset where you wrote DW_FORM_addrx here?

-cary


Re: [Dwarf-discuss] PROPOSAL DW_FORM_implicit_const

2025-04-23 Thread David Anderson via Dwarf-discuss

On 4/23/25 11:25, Cary Coutant wrote:

> David,
>
> > As part of this, we rearrange
> > the references from
> > `DW_FORM_implicit_const`, `DW_FORM_addrx`, and `DW_FORM_indirect`
> > to be listed in the order
> > `DW_FORM_addrx`, `DW_FORM_implicit_const`, and `DW_FORM_indirect`.
> >
> > `DW_FORM_addrx` is not part of the proposal so
> > we keep it separate (just preceding)
> > `DW_FORM_implicit_const` and `DW_FORM_indirect`.
>
> Did you mean DW_FORM_addrx_offset where you wrote DW_FORM_addrx here?
>
> -cary


Oops. Yes. DW_FORM_addrx_offset.  New in DWARF6.
DavidA

--
Space is to place as eternity is to time.
-- Joseph Joubert


Re: [Dwarf-discuss] Representing captured `this` in C++ lambdas

2025-04-23 Thread David Blaikie via Dwarf-discuss
On Wed, Apr 23, 2025 at 2:53 PM Kyle Huey wrote:

> On Wed, Apr 23, 2025 at 2:20 PM David Blaikie wrote:
> >>
> >> Do you object to anything I proposed other than removing the
> >> representation of the anonymous class compilers generate for lambdas?
> >
> >
> > I'm not a /super/ fan of introducing a bunch of locals in addition to
> > the member descriptions - it'll be a bunch of extra DWARF that'd be nice to
> > avoid if we can...
>
> Yeah, that would be the reason to get rid of the representation of the
> anonymous struct. Then we're just converting members into locals
> rather than duplicating anything. Even if you really want to keep the
> class itself, the members could be dropped.
>

Except it's likely users will want to inspect the state of a lambda in some
situations. They get passed around, stored (in std::functions or similar
type-erased things, often), etc in many cases & may be important to know
what they represent when not near/in a call to the lambda. So having the
members described seems important.


> > But putting object_pointer on the class member that stores "this" seems
> > problematic since that's effectively at the same scope as the real object
> > pointer - it'd be awkward to say there's two "this" at the same scope and
> > have to say that the member variable "this" shadows the real "this" in some
> > way.
> >
> > And then you want the captured variables to be in a scope that is inside
> > the "this" scope so they override unqualified lookup for any names that are
> > also members of "this"...
> >
> > So, yeah, I get why you/gcc developers arrived where they did. I
> > wouldn't mind some size analysis to see how bad the regression/cost is,
> > that might help inform whether it's worth trying to address it.
>
> Hmm. I could look at hacking something up to measure but it's not at
> the top of my priority list.
>

Fair.


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-04-23 Thread Kyle Huey via Dwarf-discuss
On Wed, Apr 23, 2025 at 7:46 PM Cary Coutant wrote:
>>
>> The first part of this is straightforward. The DWARF for Base will
>> contain a member for the vtable pointer, and that plus knowledge of
>> how the ABI lays out vtables allows the debugger to effectively do a
>> dynamic_cast to obtain a pointer to the most derived object.
>> From there the vtable address is compared against the ELF symbol table
>> to find the mangled name of the vtable symbol.
>
>
> This made me do a bit of research...
>
> The artificial member for the vtable pointer appears to be a DWARF extension 
> requested as far back as 2003 and implemented in 2009:
>
>    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11208

That was an interesting read. I particularly enjoyed this bit in the
original issue, written in 2002.

"This one isn't really necessary.  The ABI specifies where the vtable
 pointer will be, and GDB is quite capable of using that knowledge to
 identify the runtime type."

And yet here we are 23 years later with examples where gdb et al fail
to correctly identify the runtime type :)

> But I can't find any relevant discussion on the DWARF mailing lists, until a 
> question arose about that very member in 2022:
>
>    https://dwarfstd.org/pipermail/dwarf-discuss/2022-February/002127.html
>
> It seems to me that, given the apparent need for this information in the 
> DWARF info, we should have addressed it in DWARF by now. I suspect the DWARF 
> committee's position was (or would have been) that the ABI tells you how to 
> find the vtable so it doesn't need to be explicitly recorded in the DWARF 
> info. But if both GCC and LLVM have decided it's useful enough (and there's 
> discussion about that point in the original PR that 11208 spun off from), 
> then we should discuss it. Otherwise, we risk having different toolchains 
> adopt different solutions. (GCC and LLVM appear to have avoided that through 
> careful consideration of what the other project was doing.) The argument in 
> PR 11208 is that it's /legal/ in DWARF to do this, so no new DWARF feature 
> was requested.
>
> The request in PR 11208 was for three things:
>
> > 1) I'd like to be able to locate the vtable pointer in the class
> >    structure so that the debugger knows that the hole in the apparent
> >    layout is not padding.
> >
> > 2) I'd like to know the type of the target of the vtable pointer, so
> >    that if the user asks to see it they see something sane.
> >
> > 3) I'd like to be able to find a specific virtual functions entry in
> >    the vtable, however I believe that this information will be best
> >    expressed as a property of the function, not directly of the class
> >    or vtable. DWARF3 has the DW_AT_vtable_elem_location attribute for
> >    precisely this information. gcc should generate that too.
> >
> >    Quoting the DWARF spec again :-
> >      An entry for a virtual function also has a
> >      DW_AT_vtable_elem_location attribute whose value contains a
> >      location description yielding the address of the slot for the
> >      function within the virtual function table for the enclosing
> >      class. The address of an object of the enclosing type is pushed
> >      onto the expression stack before the location description is
> >      evaluated.
>
> Point #1 is satisfied with an artificial member whose data_member_location is 
> the offset of the vtable pointer.

Right.

> I'm not clear how Point #2 was addressed.

I don't think it was.

> Point #3 was addressed via the vtable_elem_location attribute.
>
> Looking at the DWARF generated by GCC (and I'm guessing LLVM does the same), 
> I see vtable_elem_location attributes that look like this:
>
> <1b8>   DW_AT_vtable_elem_location: 2 byte block: 10 0 (DW_OP_constu: 0)
>
> This is not correct DWARF! It's supposed to be a location description, and
> this is merely a DWARF expression that evaluates to an offset relative to the
> vtable pointer. The description of the attribute says that the address of an
> object of the enclosing type is pushed onto the expression stack, so there
> really ought to be a DW_OP_deref to get the vtable pointer on the stack,
> followed by the DW_OP_constu and DW_OP_plus.
>
> Now if we compare this to DW_AT_data_member_location, we see that one valid 
> form for that attribute is an integer constant providing the offset of the 
> data member. But even there, if the attribute has a location expression, it 
> should compute an actual address, not just deliver the offset.
>
> It would seem an obvious and useful extension to DWARF to allow 
> DW_AT_vtable_elem_location to take a constant class form that provides the 
> offset relative to the start of the vtable, so an acceptable form of the 
> attribute might be:
>
> <1b8>   DW_AT_vtable_elem_location: 0   # (using a constant class form)
>
> There's still the question of what do we do about the form GCC is already 
> emitting (and has been emitting since 2009)? Make it legal? Let it slide and 
>