[Dwarf-discuss] [Proposal] Allow alternate encoding of DW_AT_object_pointer as a variable index instead of DIE reference

2025-01-30 Thread Michael Buch via Dwarf-discuss
# Allow alternate encoding of DW_AT_object_pointer as a variable index instead of DIE reference

## Background

`DW_AT_object_pointer` is used by LLDB to conveniently determine the
CV-qualifiers and storage class of C++ member functions when reconstructing
types from DWARF. GCC currently emits `DW_AT_object_pointer` on both
declaration and definition DIEs [1]. Clang does not emit it on declarations,
making the LLDB heuristics to find the object parameter fragile. We tried
attaching `DW_AT_object_pointer` to declarations in Clang too [2], but that
came at the cost of a ~5-10% increase in the `.debug_info` section size for
some users, so we reverted it. This proposal describes an alternate encoding
of `DW_AT_object_pointer` which allows us to add it to declaration DIEs
without incurring such size overheads.

## Overview

The idea is to encode the index of the `DW_TAG_formal_parameter` that is the
object parameter instead of a DIE reference. This index could then be of form
`DW_FORM_implicit_const`, so we don't pay the 4 bytes for each reference,
but instead pay for it once in the abbreviation.
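
To make the size argument concrete, here is a rough back-of-the-envelope
sketch. It assumes the reference encoding uses `DW_FORM_ref4`, that the
attribute and form codes each fit in one ULEB128 byte, and that the index
fits in a one-byte SLEB128. The function names and the
`num_dies`/`num_abbrevs` parameters are made up for illustration.

```cpp
#include <cstdint>

// Extra bytes when DW_AT_object_pointer is encoded as DW_FORM_ref4:
// every DIE carries a 4-byte reference in .debug_info, and every
// abbreviation listing the attribute carries 2 bytes (name + form).
constexpr std::uint64_t CostWithRef4(std::uint64_t num_dies,
                                     std::uint64_t num_abbrevs) {
  return num_dies * 4 + num_abbrevs * 2;
}

// Extra bytes with DW_FORM_implicit_const: the DIEs carry nothing, and
// each abbreviation carries 3 bytes (name + form + one-byte SLEB128 index).
constexpr std::uint64_t CostWithImplicitConst(std::uint64_t /*num_dies*/,
                                              std::uint64_t num_abbrevs) {
  return num_abbrevs * 3;
}
```

Under these assumptions, 1,000,000 declaration DIEs sharing 100 abbreviations
would cost about 4 MB with `DW_FORM_ref4` and 300 bytes with
`DW_FORM_implicit_const`.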

The implementation in Clang for this is currently being discussed in [3].

The DWARF spec currently only mentions `reference` as the attribute class
of `DW_AT_object_pointer`. So consumers may be surprised by this alternate
encoding. Hence we thought it'd be good to run this past the committee.

An alternative solution could be a new attribute describing the object
parameter index (e.g., `DW_AT_object_pointer_index` with a `constant`
attribute class).

## Proposed Changes

In chapter "7.5.4 Attribute Encodings", change the "Table 7.5: Attribute
encodings" table as follows:

[ORIGINAL TEXT]
>>>
Attribute Name | Value | Classes
---
...
DW_AT_object_pointer | 0x64 | reference
...
[NEW TEXT]
==
Attribute Name | Value | Classes
---
...
DW_AT_object_pointer | 0x64 | reference, constant
...
<<<

In chapter "5.7.8 Member Function Entries", extend the attribute class
recommendation as follows:

[ORIGINAL TEXT]
>>>
If the member function entry describes a non-static member function, then that
entry has a DW_AT_object_pointer attribute whose value is a reference to the
formal parameter entry that corresponds to the object for which the function is
called.
[NEW TEXT]
==
If the member function entry describes a non-static member function, then that
entry has a DW_AT_object_pointer attribute whose value is a reference to the
formal parameter entry that corresponds to the object for which the function is
called. A producer may also choose to represent it as a constant whose value is
the zero-based index of the formal parameter that corresponds to the object
parameter.
<<<
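
As a rough consumer-side sketch, resolving the object parameter under the
proposed wording could look like the following. The `Die`, `Tag` and
`ObjectPointerAttr` types are a hypothetical, simplified model and not the API
of LLDB or any DWARF library. The sketch also assumes the index counts only
`DW_TAG_formal_parameter` children, which the wording above implies but does
not spell out.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a child DIE of a subprogram.
enum class Tag { FormalParameter, Other };

struct Die {
  std::uint64_t offset;  // offset of this DIE within its unit
  Tag tag;
};

// Simplified DW_AT_object_pointer value: class `reference` or `constant`.
struct ObjectPointerAttr {
  bool is_reference;
  std::uint64_t value;  // DIE offset if reference, zero-based index if constant
};

// Returns the formal parameter DIE designated by the attribute, or nullptr.
const Die *FindObjectParameter(const std::vector<Die> &children,
                               const ObjectPointerAttr &attr) {
  std::uint64_t param_index = 0;
  for (const Die &child : children) {
    if (child.tag != Tag::FormalParameter)
      continue;  // assumption: only formal parameters are counted
    const bool match = attr.is_reference ? child.offset == attr.value
                                         : param_index == attr.value;
    if (match)
      return &child;
    ++param_index;
  }
  return nullptr;
}
```

A consumer that only understands the `reference` class would need a branch
like this before it could accept the `constant` form.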

## References

* [1]: https://godbolt.org/z/3TWjTfWon
* [2]: https://github.com/llvm/llvm-project/pull/122742
* [3]: https://github.com/llvm/llvm-project/pull/124790
-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss


[Dwarf-discuss] [Proposal] DW_AT_object_pointer: clarify wording around implicit versus explicit object parameters

2025-01-22 Thread Michael Buch via Dwarf-discuss
# DW_AT_object_pointer: clarify wording around implicit versus explicit object parameters

## Background

With C++23 we got the ability to explicitly spell out in source the
object parameter of a class method [1]. The object parameter for such
methods is not compiler-generated and is explicitly named by the user.
The wording of the current DWARF spec assumes object parameters are
implicit and specifies that those parameters be marked
DW_AT_artificial, despite them not falling into the category for
"artificial" parameters.

Other examples of languages with explicit object parameters are Python and Rust.
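
In C++ terms, the difference looks roughly like the sketch below. `Counter`,
`get` and `get_explicit` are illustrative names only. The feature-test guard
is there so the snippet still compiles with compilers that do not advertise
C++23 explicit object parameters.

```cpp
struct Counter {
  int value = 0;

  // Implicit object parameter: `this` is generated by the compiler and
  // never appears in the parameter list.
  int get() const { return value; }

#if defined(__cpp_explicit_this_parameter)
  // C++23 explicit object parameter: `self` is written and named by the
  // user, so it is not compiler-generated.
  int get_explicit(this const Counter &self) { return self.value; }
#endif
};
```

Both member functions are called the same way (`c.get()`,
`c.get_explicit()`). Only the second has an object parameter that appears in
the source.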

Recently Clang started emitting DW_AT_object_pointer for explicit
object parameters but decided not to mark them artificial [2][3].

## Overview

This proposal adjusts the wording in the DWARF spec to clarify that
explicit object parameters referenced by DW_AT_object_pointer need not
be marked DW_AT_artificial.

## Proposed Changes

In chapter "5.7.8 Member Function Entries", change the mention of
`DW_AT_artificial` in the last sentence as follows:

[ORIGINAL TEXT]
>>>
If the member function entry describes a non-static member function,
then that entry has a DW_AT_object_pointer attribute whose value is a
reference to the formal parameter entry that corresponds to the object
for which the function is called. The name attribute of that formal
parameter is defined by the current language (for example, this for
C++ or self for Objective C and some other languages). That parameter
also has a DW_AT_artificial attribute whose value is true.
[NEW TEXT]
==
If the member function entry describes a non-static member function,
then that entry has a DW_AT_object_pointer attribute whose value is a
reference to the formal parameter entry that corresponds to the object
for which the function is called. The name attribute of that formal
parameter is defined by the current language (for example, this for
C++ or self for Objective C and some other languages). Many languages
make the object pointer an implicit parameter with no syntax. In that
case the parameter should have a DW_AT_artificial attribute whose
value is true.
<<<

Alternatively we could just omit any mention of DW_AT_artificial. And
we might want to add some informative text about C++23's explicit
object parameters.

## References

* [1]: https://en.cppreference.com/w/cpp/language/member_functions#Explicit_object_member_functions
* [2]: https://github.com/llvm/llvm-project/pull/122897
* [3]: https://github.com/llvm/llvm-project/pull/122928


Re: [Dwarf-discuss] [Proposal] DW_AT_object_pointer: clarify wording around implicit versus explicit object parameters

2025-01-23 Thread Michael Buch via Dwarf-discuss
Thanks for filing.

Mind updating the issue with the latest proposal I attached earlier? I.e.,
we’ll just remove the wording in the current spec.

Many thanks,
Michael

On Thu, Jan 23, 2025 at 20:35 Cary Coutant wrote:

> On Wed, Jan 22, 2025 at 2:54 AM Michael Buch via Dwarf-discuss <
> dwarf-discuss@lists.dwarfstd.org> wrote:
>
>> # DW_AT_object_pointer: clarify wording around implicit versus
>> explicit object parameters
>
>
> Filed as Issue 250122.1:
>
> https://dwarfstd.org/issues/250122.1.html
>
> -cary
>
>


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-04-23 Thread Michael Buch via Dwarf-discuss
Sounds like this is what
https://github.com/llvm/llvm-project/pull/130255 is trying to achieve?
If we could simplify that part of LLDB that'd be great!

On Wed, 23 Apr 2025 at 16:32, Kyle Huey via Dwarf-discuss wrote:
>
> Consider the following C++ program
>
> #include <stdio.h>
>
> class Base {
> public:
>   virtual const char* method1() = 0;
>   void method2() {
>     printf("%s\n", method1());
>   }
> };
>
> class DerivedOne : public Base {
>   virtual const char* method1() override {
>     return "DerivedOne";
>   }
> };
>
> template<typename T>
> class DerivedTwo : public Base {
> public:
>   DerivedTwo(T t) : t(t) {}
> private:
>   virtual const char* method1() override {
>     return t;
>   }
>   T t;
> };
>
> template<typename T>
> class DerivedThree : public Base {
> public:
>   DerivedThree(T t) : t(t) {}
> private:
>   virtual const char* method1() override {
>     return t();
>   }
>   T t;
> };
>
> int main() {
>   DerivedOne d1;
>   DerivedTwo d2("DerivedTwo");
>   DerivedThree d3([]() {
>     return "DerivedThree";
>   });
>   d1.method2();
>   d2.method2();
>   d3.method2();
>   return 0;
> }
>
> If a debugger stops at method1, the DW_TAG_formal_parameter will tell
> the debugger the type of `this` is Base. Downcasting to the derived
> type is very useful for the programmer though, so both gdb and lldb
> contain a feature to downcast based on the vtable pointer (the "print
> object" and the "target.prefer-dynamic-value" settings in the respective
> debuggers).
>
> The first part of this is straightforward. The DWARF for Base will
> contain a member for the vtable pointer, and that plus knowledge of
> how the ABI lays out vtables allows the debugger to effectively do a
> dynamic_cast to obtain a pointer to the most derived object.
> From there the vtable address is compared against the ELF symbol table
> to find the mangled name of the vtable symbol.
>
> Then things begin to get hairy: the debugger demangles the mangled
> name that exists in the ELF symbol table, chops off the "vtable for "
> prefix on the demangled name, and searches for the type by name in the
> DWARF. If it finds the type, it adjusts the type of the value and
> prints it accordingly. But this text based matching doesn't always
> work. There are no mangled names for types so the debugger's
> demangling has to match the compiler's output character for character.
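
For illustration, the text-based step described above can be sketched roughly
as follows. This assumes an Itanium-ABI toolchain that provides
`abi::__cxa_demangle` in `<cxxabi.h>`. `TypeNameFromVTableSymbol` is a
hypothetical helper, not code taken from gdb or lldb.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Demangles a vtable symbol and strips the "vtable for " prefix, yielding
// the type name a debugger would then search for in the DWARF.
// Returns an empty string if the symbol is not a demangleable vtable symbol.
std::string TypeNameFromVTableSymbol(const char *mangled) {
  int status = 0;
  char *demangled = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr)
    return std::string();

  std::string name(demangled);
  std::free(demangled);  // __cxa_demangle allocates the result with malloc

  const std::string prefix = "vtable for ";
  if (name.compare(0, prefix.size(), prefix) != 0)
    return std::string();
  return name.substr(prefix.size());
}
```

For `_ZTV10DerivedOne` this yields `DerivedOne`, which matches the
`DW_AT_name` only because no template arguments are involved.
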
>
> In the example program I've provided, when using the respective
> compilers, gdb can successfully downcast DerivedOne and DerivedThree
> but not DerivedTwo. gdb fails because gcc emits the DW_TAG_class_type
> with a DW_AT_name "DerivedTwo<const char*>" but libiberty
> demangles the vtable symbol to "vtable for
> DerivedTwo<char const*>" and those do not match. lldb can only
> successfully downcast DerivedOne. lldb appears to not handle classes
> with template parameters correctly at all. And even if all of that
> were fixed, libiberty and llvm disagree about how to demangle the
> symbol for DerivedTwo's vtable, so the two ecosystems would not be
> interoperable.
>
> Perhaps these are merely quality of implementation issues and belong
> on the respective bug trackers, however, better representations are
> possible. Rustc, for example, does not rely on the ELF symbol table
> and demangled string matching. It emits a global variable in the DWARF
> whose location is the address of the vtable. That variable has a
> DW_AT_type pointing to a DW_TAG_class_type that describes the layout
> of the vtable, and that type has a DW_AT_containing_type that points
> to the type making use of that vtable.
>
> Any thoughts?
>
> - Kyle