Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Todd Allen via Dwarf-discuss
I think that's orthogonal to the point, which was that a rnglist is 
meant to describe a pc range, not a data range.

BTW, I don't think you need to create the vtable on the fly in that 
case.  The Pkg.Print function will need a static link, but the vtable 
doesn't need to encode that.  I just checked our Ada compiler.  We 
stopped development on this compiler after Ada 95, but the only change I 
had to make to your example to make it Ada 95-friendly, was the call to 
Print: Pkg.Print(Object).  (They were avoiding the Object.Function 
syntax in Ada 95, but I assume they relented and added some syntactic 
sugar in Ada 2005 or 2012.)  Our compiler does indeed generate a static 
vtable.

But if GNAT is generating the vtable at run-time, possibly even on the 
stack (maybe there's some other, more compelling, reason?), then we need 
to make sure the proposal isn't assuming a static location.

On 5/7/25 07:07, Pierre-Marie de Rodat wrote:
> Hello,
>
> On Wed, May 7, 2025 at 2:49 PM Todd Allen via Dwarf-discuss wrote:
>> In 250506.2, the use of a rnglist is throwing me.  I would expect the 
>> lifetime of a vtable to be the whole program.  Or did you envision the 
>> rnglist to be the range of data/rodata addresses of the vtable object?  2.17 
>> clarifies that they're code addresses (i.e. text), though.
>>
>> We did have a discussion sometime in the last year about describing 
>> data/rodata address ranges, but that was in .debug_aranges (RIP).  And, 
>> IIRC, no actual compiler was generating data/rodata address there either.
> If it helps the design: there are languages where vtables are not
> necessarily statically allocated. Here is a small Ada example,
> involving a tagged type (equivalent to a C++ class) nested in a
> procedure, and with a primitive (C++ method) that actually has
> up-level references to the procedure locals (so the vtable is actually
> tied to the current stack frame):
>
>   1  with Ada.Text_IO; use Ada.Text_IO;
>   2
>   3  procedure Main is
>   4     Msg : constant String := "Hello world";
>   5
>   6     package Pkg is
>   7        type T is tagged null record;
>   8        procedure Print (Self : T);
>   9     end Pkg;
>  10
>  11     package body Pkg is
>  12        procedure Print (Self : T) is
>  13        begin
>  14           Put_Line (Msg);
>  15        end Print;
>  16     end Pkg;
>  17
>  18     Object : Pkg.T;
>  19  begin
>  20     Object.Print;
>  21  end Main;
>
> GDB allows us to observe where the vtable for T is stored (tested on a
> x86_64-linux machine):
>
> $ gdb ./main
> (gdb) b main.adb:20
> Breakpoint 1 at 0x6ae9: file main.adb, line 20.
> (gdb) r
> […]
> Breakpoint 1, main () at main.adb:20
> 20 Object.Print;
> (gdb) set lang c
> Warning: the current language does not match this frame.
> (gdb) print object
> $1 = {_tag = 0x7fffda30}
> (gdb) p $rsp
> $2 = (void *) 0x7fffd7d0
> (gdb) p $rbp
> $3 = (void *) 0x7fffda70
>
> “_tag” is an artificial component for the record T that GCC
> (currently) generates in the debug info to materialize the vtable: it
> points to a structure that is in the current stack frame (between $rsp
> and $rbp).
>
> --
> Pierre-Marie de Rodat 


-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss


Re: [Dwarf-discuss] EXTERNAL: Re: Representing vtables in DWARF for downcasting

2025-05-07 Thread Todd Allen via Dwarf-discuss
> > BTW, I don't think you need to create the vtable on the fly in that
> > case.  The Pkg.Print function will need a static link, but the vtable
> > doesn't need to encode that.  I just checked our Ada compiler.  We
> > stopped development on this compiler after Ada 95, but the only change I
> > had to make to your example to make it Ada 95-friendly, was the call to
> > Print: Pkg.Print(Object).  (They were avoiding the Object.Function
> > syntax in Ada 95, but I assume they relented and added some syntactic
> > sugar in Ada 2005 or 2012.)  Our compiler does indeed generate a static
> > vtable.
>
> Interesting: this indeed works if Main.Pkg.Print calls take a static
> link, but how can the caller find the static link to pass in the
> general case? This is obvious in my previous example (the call happens
> in the same scope that owns the static link), but thanks to type
> derivation, calls to Main.Pkg.Print can actually appear in other
> places. For instance:
>
> package Base is
>    type T is abstract tagged null record;
>    procedure Print (Self : T) is abstract;
>    procedure Call_Print (Self : T'Class);
>    function Get_Msg return String;
> end Base;
>
> package body Base is
>    procedure Call_Print (Self : T'Class) is
>    begin
>       Print (Self);
>    end Call_Print;
>
>    function Get_Msg return String is
>    begin
>       return "Hello world";
>    end Get_Msg;
> end Base;
>
> with Ada.Text_IO; use Ada.Text_IO;
> with Base;
>
> procedure Main is
>    Msg : constant String := Base.Get_Msg;
>
>    package Pkg is
>       type T is new Base.T with null record;
>       overriding procedure Print (Self : T);
>    end Pkg;
>
>    package body Pkg is
>       overriding procedure Print (Self : T) is
>       begin
>          Put_Line (Msg);
>       end Print;
>    end Pkg;
>
>    Object : Pkg.T;
> begin
>    Base.Call_Print (Object);
> end Main;
>
> Since Main.Pkg.Print overrides a library-level primitive, it can’t
> take a static link so the only way for it to have access to the
> Main.Msg local is through Self. I guess a compiler could decide to put
> the static link in each Main.Pkg.T object rather than in their vtable
> (and thus have a static vtable), but as far as I can tell, GNAT stores
> the static link in the vtable instead, so the vtable cannot be static.



That looks like the more compelling reason, then.

I tried this in our Ada 95 compiler, and it's rejected because of accessibility
rules (3.9.1(3)).  And it looks like a legit rejection.  Those rules are usually
designed to avoid dangling references of access types, but the same concept was
applied to type extensions (derived tagged types).  I don't know if the
rationale was to avoid static link issues specifically, or if that was just a
happy side effect.  But evidently the rules were loosened in a later language
revision.

BTW, I sure hope objects of Main.Pkg.T cannot escape the invocation of Main!
If they can, it seems like you're moving into lambda closure territory.

Todd



Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Todd Allen via Dwarf-discuss
In 250506.2, the use of a rnglist is throwing me.  I would expect the lifetime 
of a vtable to be the whole program.  Or did you envision the rnglist to be the 
range of data/rodata addresses of the vtable object?  2.17 clarifies that 
they're code addresses (i.e. text), though.

We did have a discussion sometime in the last year about describing data/rodata 
address ranges, but that was in .debug_aranges (RIP).  And, IIRC, no actual 
compiler was generating data/rodata address there either.

On 5/6/25 19:19, Cary Coutant wrote:
I've written a three-part proposal to address these issues:

  *   The first part, 250506.1, proposes a standard mechanism for locating
      the virtual function table (vtable) given an object of a polymorphic
      class.
  *   The second part, 250506.2, proposes a standard mechanism for
      identifying the most-derived class of an object, given its vtable
      location, in order to support downcasting of pointers while debugging.
  *   The third part, 250506.3, proposes a fix to the
      DW_AT_vtable_elem_location attribute, which appears to be incorrectly
      implemented in compilers today.

-cary


On Fri, May 2, 2025 at 1:31 PM Todd Allen via Dwarf-discuss
<dwarf-discuss@lists.dwarfstd.org> wrote:
FWIW, when we at Concurrent were in the compiler business, our C++ compilers 
generated two vendor-defined attributes, both hanging off the 
DW_TAG_{structure,class}_type.  Here are a couple with some sample locations:
DW_AT_vtable_location [DW_OP_plus_uconst 0; DW_OP_deref]
DW_AT_type_vtable_location [DW_OP_addr 0x12345678]
The first was a description of how to obtain the address of the vtable tag from 
an object.
The second was a description of the address of the vtable tag from just the 
type.

As we characterized them internally, they didn't have to be the address of the 
vtable proper.  They just had to be something that could be compared as a 
positive identification of the actual type.  I believe they always were the 
actual vtable addresses, though.  Because why not?

We do still have logic in our debugger to use them, too.  In addition to the 
mangling-based approaches.

It does require walking the whole DWARF tree to find them.
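
A minimal sketch of how a consumer might combine the two attributes
described above. Everything here is hypothetical: the addresses, the
dictionary standing in for target memory, and the helper names are
made up for illustration, and the type_vtables table stands for what a
debugger would collect by walking the whole DWARF tree.

```python
def eval_object_vtable(memory, obj_addr, offset=0):
    """Mimic [DW_OP_plus_uconst offset; DW_OP_deref]:
    read the pointer stored at obj_addr + offset."""
    return memory[obj_addr + offset]


def identify_type(memory, obj_addr, type_vtables):
    """Compare the tag read from the object against each type's tag.

    type_vtables maps a type name to the address its
    DW_AT_type_vtable_location expression would yield.
    """
    tag = eval_object_vtable(memory, obj_addr)
    for type_name, vtable_addr in type_vtables.items():
        if vtable_addr == tag:
            return type_name
    return None  # no positive identification


# Hypothetical data: one object at 0x7000 whose first word is its vtable tag.
type_vtables = {"CBase": 0x12345600, "CDerived": 0x12345678}
memory = {0x7000: 0x12345678, 0x7100: 0xDEADBEEF}
```

The linear scan over type_vtables is where the cost of having no index
shows up; a real consumer would presumably build a reverse map once.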

Todd

On 4/25/25 09:49, Jeremy Morse via Dwarf-discuss wrote:
Hi all,

The LLVM discussion linked [0] happens to be us Sony folks, and it's supporting 
the use-case Kyle described of automatic downcasting, i.e. identifying the 
most-derived-class of an object from its vtable pointer. Having to demangle the 
symbol table is a real pain (Tom, CC'd knows more) especially with things like 
anonymous namespaces.

Right now the approach is to have a top-level nameless global variable with the 
location set to the vtable address, and a DW_AT_specification linking into the 
class definition:

0x0082:   DW_TAG_variable
            DW_AT_specification (0x00b6 "_vtable$")
            DW_AT_alignment     (8)
            DW_AT_location      (DW_OP_addrx 0x1)

[Then deeper into the DIE tree,]

0x008b:   DW_TAG_structure_type
            DW_AT_containing_type    (0x0034 "CBase")
            DW_AT_calling_convention (DW_CC_pass_by_reference)
            DW_AT_name               ("CDerived")
            DW_AT_decl_file          ("vtables.cpp")
            DW_AT_decl_line          (6)

            [...]

0x00b6:     DW_TAG_variable
              DW_AT_name          ("_vtable$")
              DW_AT_type          (0x0081 "void *")
              DW_AT_external      (true)
              DW_AT_declaration   (true)
              DW_AT_artificial    (true)
              DW_AT_accessibility (DW_ACCESS_private)

This works well enough for our own debugger use-cases; I agree with Cary that 
it's hacky to rely on the name of a variable to signify important information 
like this and an officially blessed way could help.
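
A rough sketch of how a debugger might consume the layout above,
assuming the DIEs have already been parsed into dictionaries keyed by
offset. The dictionary shape, the "parent" field, and the resolved
address 0x1000 (standing in for whatever DW_OP_addrx 0x1 resolves to
through .debug_addr) are assumptions made for illustration only.

```python
# Hypothetical, already-parsed DIEs mirroring the dump above.
dies = {
    0x0082: {"tag": "DW_TAG_variable", "specification": 0x00B6,
             "location": 0x1000, "parent": None},
    0x008B: {"tag": "DW_TAG_structure_type", "name": "CDerived",
             "parent": None},
    0x00B6: {"tag": "DW_TAG_variable", "name": "_vtable$",
             "parent": 0x008B},
}


def vtable_to_class(dies):
    """Map each vtable address to the name of the class that owns it.

    A top-level variable qualifies if its DW_AT_specification points
    at a member declaration named "_vtable$"; the owning class is the
    parent of that declaration.
    """
    result = {}
    for die in dies.values():
        if die["tag"] != "DW_TAG_variable":
            continue
        spec = dies.get(die.get("specification"))
        if spec is None or spec.get("name") != "_vtable$":
            continue
        owner = dies[spec["parent"]]
        result[die["location"]] = owner["name"]
    return result
```

The string comparison against "_vtable$" is exactly the name-based
convention that a dedicated attribute would replace.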

I've no opinion on the  DW_AT_vtable_elem_location behaviours, although we can 
consider it a separate issue.

[0] https://github.com/llvm/llvm-project/pull/130255

--
Thanks,
Jeremy







Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Ben Woodard via Dwarf-discuss
Can unions have virtual functions and thus vtables? My understanding is
that unions can have member functions, but I can't see how they could
have virtual functions. What function would you call? Where would you
put the vtable pointer?


This cppreference article indicates that they cannot have virtual
functions at the source level:
https://en.cppreference.com/w/cpp/language/union#:~:text=A%20union%20can%20have%20member,have%20a%20default%20member%20initializer.
I also can't see how a union with a vtable could be implemented.


So I think "A structure, union, or class type may have a 
DW_AT_vtable_location attribute," should only be "A structure, or class..."


-ben

On 5/6/25 6:19 PM, Cary Coutant via Dwarf-discuss wrote:

I've written a three-part proposal to address these issues:

  * The first part, 250506.1, proposes a standard mechanism for
    locating the virtual function table (vtable) given an object of a
    polymorphic class.
  * The second part, 250506.2, proposes a standard mechanism for
    identifying the most-derived class of an object, given its vtable
    location, in order to support downcasting of pointers while
    debugging.
  * The third part, 250506.3, proposes a fix to the
    DW_AT_vtable_elem_location attribute, which appears to be
    incorrectly implemented in compilers today.

-cary


On Fri, May 2, 2025 at 1:31 PM Todd Allen via Dwarf-discuss wrote:


FWIW, when we at Concurrent were in the compiler business, our C++
compilers generated two vendor-defined attributes, both hanging
off the DW_TAG_{structure,class}_type.  Here are a couple with
some sample locations:

DW_AT_vtable_location [DW_OP_plus_uconst 0; DW_OP_deref]
DW_AT_type_vtable_location [DW_OP_addr 0x12345678]

The first was a description of how to obtain the address of the
vtable tag from an object.
The second was a description of the address of the vtable tag from
just the type.

As we characterized them internally, they didn't have to be the
address of the vtable proper.  They just had to be something that
could be compared as a positive identification of the actual
type.  I believe they always were the actual vtable addresses,
though. Because why not?

We do still have logic in our debugger to use them, too.  In
addition to the mangling-based approaches.

It does require walking the whole DWARF tree to find them.

Todd

On 4/25/25 09:49, Jeremy Morse via Dwarf-discuss wrote:

Hi all,

The LLVM discussion linked [0] happens to be us Sony folks, and
it's supporting the use-case Kyle described of automatic
downcasting, i.e. identifying the most-derived-class of an object
from its vtable pointer. Having to demangle the symbol table is a
real pain (Tom, CC'd knows more) especially with things like
anonymous namespaces.

Right now the approach is to have a top-level nameless global
variable with the location set to the vtable address, and a
DW_AT_specification linking into the class definition:

0x0082:   DW_TAG_variable
            DW_AT_specification (0x00b6 "_vtable$")
            DW_AT_alignment     (8)
            DW_AT_location      (DW_OP_addrx 0x1)

[Then deeper into the DIE tree,]

0x008b:   DW_TAG_structure_type
            DW_AT_containing_type    (0x0034 "CBase")
            DW_AT_calling_convention (DW_CC_pass_by_reference)
            DW_AT_name               ("CDerived")
            DW_AT_decl_file          ("vtables.cpp")
            DW_AT_decl_line          (6)

            [...]

0x00b6:     DW_TAG_variable
              DW_AT_name          ("_vtable$")
              DW_AT_type          (0x0081 "void *")
              DW_AT_external      (true)
              DW_AT_declaration   (true)
              DW_AT_artificial    (true)
              DW_AT_accessibility (DW_ACCESS_private)

This works well enough for our own debugger use-cases; I agree
with Cary that it's hacky to rely on the name of a variable to
signify important information like this and an officially blessed
way could help.

I've no opinion on the DW_AT_vtable_elem_location behaviours,
although we can consider it a separate issue.

[0] https://github.com/llvm/llvm-project/pull/130255

--
Thanks,
Jeremy






Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Pierre-Marie de Rodat via Dwarf-discuss
Hello,

On Wed, May 7, 2025 at 2:49 PM Todd Allen via Dwarf-discuss wrote:
> In 250506.2, the use of a rnglist is throwing me.  I would expect the 
> lifetime of a vtable to be the whole program.  Or did you envision the 
> rnglist to be the range of data/rodata addresses of the vtable object?  2.17 
> clarifies that they're code addresses (i.e. text), though.
>
> We did have a discussion sometime in the last year about describing 
> data/rodata address ranges, but that was in .debug_aranges (RIP).  And, IIRC, 
> no actual compiler was generating data/rodata address there either.

If it helps the design: there are languages where vtables are not
necessarily statically allocated. Here is a small Ada example,
involving a tagged type (equivalent to a C++ class) nested in a
procedure, and with a primitive (C++ method) that actually has
up-level references to the procedure locals (so the vtable is actually
tied to the current stack frame):

 1  with Ada.Text_IO; use Ada.Text_IO;
 2
 3  procedure Main is
 4     Msg : constant String := "Hello world";
 5
 6     package Pkg is
 7        type T is tagged null record;
 8        procedure Print (Self : T);
 9     end Pkg;
10
11     package body Pkg is
12        procedure Print (Self : T) is
13        begin
14           Put_Line (Msg);
15        end Print;
16     end Pkg;
17
18     Object : Pkg.T;
19  begin
20     Object.Print;
21  end Main;

GDB allows us to observe where the vtable for T is stored (tested on a
x86_64-linux machine):

$ gdb ./main
(gdb) b main.adb:20
Breakpoint 1 at 0x6ae9: file main.adb, line 20.
(gdb) r
[…]
Breakpoint 1, main () at main.adb:20
20 Object.Print;
(gdb) set lang c
Warning: the current language does not match this frame.
(gdb) print object
$1 = {_tag = 0x7fffda30}
(gdb) p $rsp
$2 = (void *) 0x7fffd7d0
(gdb) p $rbp
$3 = (void *) 0x7fffda70

“_tag” is an artificial component for the record T that GCC
(currently) generates in the debug info to materialize the vtable: it
points to a structure that is in the current stack frame (between $rsp
and $rbp).
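
To spell out the arithmetic behind that last remark, here is a small
check using the three values printed by GDB above. The helper name
in_frame is made up for this sketch, and it assumes the usual x86_64
convention that the current frame's locals live between $rsp and $rbp.

```python
# Values copied from the GDB session above.
rsp = 0x7fffd7d0
rbp = 0x7fffda70
tag = 0x7fffda30


def in_frame(addr, rsp, rbp):
    """True if addr lies within [rsp, rbp], i.e. inside the current frame."""
    return rsp <= addr <= rbp


# Distance of the pointed-to structure below the frame pointer, in bytes.
offset_below_rbp = rbp - tag
```

Here in_frame(tag, rsp, rbp) holds, and the structure sits 0x40 bytes
below $rbp.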

-- 
Pierre-Marie de Rodat 


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Pierre-Marie de Rodat via Dwarf-discuss
On Wed, May 7, 2025 at 4:11 PM Todd Allen  wrote:
> I think that's orthogonal to the point, which was that a rnglist is
> meant to describe a pc range, not a data range.

Got it; I had not realized this was your point. I was mainly reacting
to the part of your message “I would expect the lifetime of a vtable to
be the whole program.”

> BTW, I don't think you need to create the vtable on the fly in that
> case.  The Pkg.Print function will need a static link, but the vtable
> doesn't need to encode that.  I just checked our Ada compiler.  We
> stopped development on this compiler after Ada 95, but the only change I
> had to make to your example to make it Ada 95-friendly, was the call to
> Print: Pkg.Print(Object).  (They were avoiding the Object.Function
> syntax in Ada 95, but I assume they relented and added some syntactic
> sugar in Ada 2005 or 2012.)  Our compiler does indeed generate a static
> vtable.

Interesting: this indeed works if Main.Pkg.Print calls take a static
link, but how can the caller find the static link to pass in the
general case? This is obvious in my previous example (the call happens
in the same scope that owns the static link), but thanks to type
derivation, calls to Main.Pkg.Print can actually appear in other
places. For instance:

package Base is
   type T is abstract tagged null record;
   procedure Print (Self : T) is abstract;
   procedure Call_Print (Self : T'Class);
   function Get_Msg return String;
end Base;

package body Base is
   procedure Call_Print (Self : T'Class) is
   begin
      Print (Self);
   end Call_Print;

   function Get_Msg return String is
   begin
      return "Hello world";
   end Get_Msg;
end Base;

with Ada.Text_IO; use Ada.Text_IO;
with Base;

procedure Main is
   Msg : constant String := Base.Get_Msg;

   package Pkg is
      type T is new Base.T with null record;
      overriding procedure Print (Self : T);
   end Pkg;

   package body Pkg is
      overriding procedure Print (Self : T) is
      begin
         Put_Line (Msg);
      end Print;
   end Pkg;

   Object : Pkg.T;
begin
   Base.Call_Print (Object);
end Main;

Since Main.Pkg.Print overrides a library-level primitive, it can’t
take a static link so the only way for it to have access to the
Main.Msg local is through Self. I guess a compiler could decide to put
the static link in each Main.Pkg.T object rather than in their vtable
(and thus have a static vtable), but as far as I can tell, GNAT stores
the static link in the vtable instead, so the vtable cannot be static.
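
As a language-neutral illustration of that reasoning (not of GNAT's
actual code generation, which I have only inferred), here is a small
Python model. The dictionaries standing in for objects and vtables and
the function names are all made up: the point is only that the
library-level caller has nothing but the object to work with, so the
per-invocation context has to be reachable from it.

```python
def call_print(obj):
    """Library-level dispatching call: it has only the object, no static link."""
    return obj["_tag"]["Print"](obj)


def main(msg):
    """Model of one invocation of Main, which declares the derived type locally."""

    def print_impl(self):
        # Up-level reference to a local of the enclosing invocation.
        return msg

    # The dispatch table is built per invocation, so its entry can
    # capture this frame; it cannot be a single static table.
    vtable = {"Print": print_impl}
    obj = {"_tag": vtable}
    return call_print(obj), vtable


# Two invocations dispatch correctly and use two distinct tables.
out1, vt1 = main("Hello world")
out2, vt2 = main("Bye")
```

Returning the table from main is only there so the two tables can be
compared; in the Ada program nothing is meant to outlive the frame.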

> But if GNAT is generating the vtable at run-time, possibly even on the
> stack (maybe there's some other, more compelling, reason?), then we need
> to make sure the proposal isn't assuming a static location.

Agreed.

-- 
Pierre-Marie de Rodat 


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Cary Coutant via Dwarf-discuss
>
> In 250506.2, the use of a rnglist is throwing me.  I would expect the
> lifetime of a vtable to be the whole program.  Or did you envision the
> rnglist to be the range of data/rodata addresses of the vtable object?
> 2.17 clarifies that they're code addresses (i.e. text), though.
>

The DW_AT_vtable_ranges attribute is for the CU, and gives a set of ranges
of memory addresses for the vtables in that CU, not just one. (This is not
a loclist, but a rnglist.) Typically (at least in C++), vtables are emitted
in a specific section (where they can be relocated then made read-only),
and could be described by a single range list entry. With COMDATs, it might
take more than one range. I thought about adding DW_AT_vtable_low/high as
an alternative to the range list, but it seemed like overkill.
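
A minimal sketch of the accelerated lookup this is meant to enable,
assuming a consumer has already decoded each CU's vtable ranges into
(low, high) pairs. The CU names and addresses are invented, the ranges
are treated as half-open and non-overlapping, and build_index and
find_cu are hypothetical helper names, not part of any proposal text.

```python
from bisect import bisect_right


def build_index(cu_ranges):
    """Flatten {cu_name: [(low, high), ...]} into a list sorted by low address.

    Each range is half-open: it covers low <= addr < high.
    """
    return sorted((low, high, cu)
                  for cu, ranges in cu_ranges.items()
                  for low, high in ranges)


def find_cu(index, vtable_addr):
    """Return the CU whose vtable ranges contain vtable_addr, or None."""
    lows = [low for low, _, _ in index]
    pos = bisect_right(lows, vtable_addr) - 1
    if pos < 0:
        return None
    low, high, cu = index[pos]
    return cu if low <= vtable_addr < high else None


# Hypothetical layout: a.cpp has one range in the vtable section plus one
# COMDAT range; b.cpp has a single range.
cu_ranges = {
    "a.cpp": [(0x4000, 0x4040), (0x5000, 0x5018)],
    "b.cpp": [(0x4040, 0x4090)],
}
index = build_index(cu_ranges)
```

Given a vtable pointer read from an object, find_cu narrows the search
to one CU, so only that CU's DIEs need to be scanned for the matching
class.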

I may have to tweak the wording in 2.17 to cover the usage of range lists
for vtables. I think 2.17.3, "Non-Contiguous Address Ranges," is generic
enough to apply to vtable addresses as well as code addresses.

> If it helps the design: there are languages where vtables are not
> necessarily statically allocated.


For vtables that are generated dynamically, I don't know that there's
anything we can do to help with accelerated access.

-cary