Re: [Dwarf-Discuss] About self-referencial sized types

2014-05-27 Thread Todd Allen
}_bound attributes. The 
> DWARFv4 specification (Appendix D, subsection 2.2 Ada Example) suggests 
> the following DIEs (I'm stripping a few attributes that are not relevant 
> for this issue):
> 
>   1$: DW_TAG_structure_type
>         DW_AT_name("Record_Type")
> 
>   2$:   DW_TAG_member
>           DW_AT_name("N")
>           DW_AT_type(reference to Integer)
> 
>   3$:   DW_TAG_array_type
>           DW_AT_type(reference to Integer)
> 
>   4$:     DW_TAG_subrange_type
>             DW_AT_type(reference to Integer)
>             DW_AT_lower_bound(constant 1)
>             DW_AT_upper_bound(reference to member N at 2$)
> 
>   5$:   DW_TAG_member
>           DW_AT_name("A")
>           DW_AT_type(reference to array type at 3$)
> 
> With this debug info, the upper bound of "A" indeed completely mirrors 
> the value of "N". In GCC, however, computing the upper bound of "A" is 
> more subtle: it is internally represented as: max(0, .N) so that 
> when "N" is negative, 0 is returned.
> 
> While it is straightforward to reference a DIE from the 
> DW_AT_upper_bound attribute, I struggle doing so inside a DWARF 
> expression, and I do need a DWARF expression to correctly describe the 
> computation of the upper bound. I guess I need an operation sequence 
> that looks like:
> 
> # Push N, then 0
> ??? Get the value of the "N" member;
> DW_OP_lit0;
> 
> # Is N > 0?
> DW_OP_over; DW_OP_over;
> DW_OP_gt;
> DW_OP_bra: 1;
> 
> # If not then return 0, else return N.
> DW_OP_swap
> DW_OP_drop
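
The control flow of the sequence above can be checked with a toy evaluator. This is only a sketch in Python: the `evaluate` helper and the tuple encoding of operations are invented for illustration, the unresolved "???" step is replaced by starting with N already on the stack, and DW_OP_bra is modeled as skipping a count of operations rather than a byte offset (DW_OP_swap is one byte long, so the two coincide here).

```python
def evaluate(ops, initial_stack):
    """Run a tiny subset of DWARF stack operations and return the top of stack."""
    stack = list(initial_stack)
    pc = 0
    while pc < len(ops):
        op, *args = ops[pc]
        pc += 1
        if op == "DW_OP_lit0":
            stack.append(0)
        elif op == "DW_OP_over":
            stack.append(stack[-2])          # copy the second entry to the top
        elif op == "DW_OP_gt":
            top = stack.pop()
            second = stack.pop()
            stack.append(1 if second > top else 0)
        elif op == "DW_OP_bra":
            if stack.pop() != 0:
                pc += args[0]                # simplification: skip whole operations
        elif op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_drop":
            stack.pop()
        else:
            raise ValueError(f"unsupported operation: {op}")
    return stack[-1]


# The sequence from the message, minus the "???" step that fetches N.
UPPER_BOUND = [
    ("DW_OP_lit0",),
    ("DW_OP_over",), ("DW_OP_over",),
    ("DW_OP_gt",),
    ("DW_OP_bra", 1),
    ("DW_OP_swap",),
    ("DW_OP_drop",),
]


def upper_bound(n):
    """Evaluate the sequence with N already pushed on the stack."""
    return evaluate(UPPER_BOUND, [n])
```

With N already on the stack, `upper_bound(n)` behaves like max(0, N), so the only open question is how to fetch N.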
> 
> So the issue for me is to know what to put instead of the "???" part. It 
> looks like the DW_OP_push_object_address (defined in section 2.5.1.3 
> Stack Operations) was introduced specifically for this kind of 
> computation, but I'm not sure what it is supposed to mean in this 
> context. Indeed, this operation would appear as part of a DWARF 
> expression under a DW_TAG_subrange_type DIE, itself under a 
> DW_TAG_array_type DIE, itself under a DW_TAG_structure_type. So what 
> address would this operation push on top of the stack? The address of 
> the "A" member, or the address of the embedding record?
> 
> The offsets of discriminants (the special record members that can be 
> used to determine the size of regular record members) inside the record 
> are statically known, so getting the address of the embedding record 
> would be enough to be able to fetch the value of the discriminant. On 
> the other hand, getting the address of the "A" member would not be 
> sufficient: in more complex cases, the offset of the "A" member can 
> depend on discriminants!
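
To make the argument concrete, here is a small Python sketch of the "address of the embedding record" reading. It assumes, purely for illustration, a little-endian target, a 4-byte discriminant N at offset 0, and the elements of A stored right after it; `make_record` and `read_discriminant` are invented helpers, not part of any debugger. It corresponds roughly to DW_OP_push_object_address followed by an offset and a 4-byte dereference, if the pushed address were that of the record. The sketch reads N as a signed value, which a real expression would have to arrange for separately.

```python
import struct

N_OFFSET = 0   # assumed static offset of the discriminant N inside the record


def make_record(n, elements):
    """Build the bytes of a hypothetical Record_Type object: N, then A's elements."""
    return struct.pack("<i", n) + b"".join(struct.pack("<i", e) for e in elements)


def read_discriminant(memory, record_address):
    """Fetch N given only the address of the embedding record."""
    (n,) = struct.unpack_from("<i", memory, record_address + N_OFFSET)
    return n


# Two records laid out back to back in a fake address space.
memory = make_record(3, [10, 20, 30]) + make_record(-1, [])
first_record, second_record = 0, 16
```

Here `read_discriminant(memory, second_record)` yields -1, and clamping it with max(0, N) gives an upper bound of 0. Nothing in the lookup needs the address of "A" itself.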
> 
> I tried to look at the implementation of DW_OP_push_object_address in 
> GDB, but it looks like it's not implemented yet. What do you think about 
> its expected behavior? And if I cannot use this operation for such array 
> bound expressions, what should I use?
> 
> Thank you in advance for your answers. :-)
> 
> -- 
> Pierre-Marie de Rodat
> 
> ___
> Dwarf-Discuss mailing list
> Dwarf-Discuss@lists.dwarfstd.org
> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org

-- 
Todd Allen
Concurrent Computer Corporation


Re: [Dwarf-Discuss] How to represent address space information in DWARF

2016-07-28 Thread Todd Allen
On Wed, Jul 27, 2016 at 07:39:54PM -0400, Tye, Tony wrote:
>Another question that has been raised as part of the HSA Foundation
>([1]http://www.hsafoundation.com/) tools working group relates to the
>manner that address spaces should be represented in DWARF.
> 
> 
> 
>HSA defines segments in which variables can be allocated. These are
>basically the same as the address spaces of OpenCL. HSA defines kernels
>that are basically the same as OpenCL kernels. A kernel is a grid launch
>of separate threads of execution (termed work-items). These work-items are
>grouped into work-groups. The work-items can access one of three main
>memory segments:
> 
> 
> 
>1. The global segment is accessible by all work-items. In hardware it is
>typically just the global memory.
> 
>2. The group segment (corresponding to the local address space of OpenCL)
>is accessible only to the work-items in the same work-group. Each
>work-group has its own copy of variables allocated in the group segment.
>On GPU hardware this can be implemented as special hardware managed
>scratch pad memory (not part of globally addressable memory), with special
>hardware instructions to access it.
> 
>3. The private segment is accessible to a single work-item. Each work-item
>has its own copy of variables allocated in the private segment. On GPU
>hardware this could also involve special hardware instructions.
> 
> 
> 
>HSA also defines the concept of a flat address (similar to OpenCL generic
>addresses). It is essentially a linearization of the addresses of the 3
>address spaces. For example, one range of a flat address maps to the group
>segment, another range maps to the private segment, and the rest map
>directly to the global segment. However, it is target specific what exact
>method is used to achieve the linearization.
> 
> 
> 
>The following was the conclusion we reached from reading the DWARF
>standard and looking at how gdb and lldb would use the information. We are
>currently working on creating a patch for LLVM to support address spaces
>and would appreciate any feedback on if this matches the intended usage of
>DWARF features to support this style of address space.
> 
> 
> 
>1. Use the DW_AT_address_class to specify that the value of a pointer-like
>value is the address within a specific address space. Pointer-like values
>include pointers, references, functions and function types. For HSA we are
>really only concerned with pointer/reference values currently. It would
>apply to a pointer-like type DIE, or a variable with a pointer-like type.
>In the case of a variable it does not specify the address space of the
>variable's location, but specifies how to treat the address value stored
>in the variable.
> 
> 
> 
>2. Use DW_OP_xderef in the location expression of a variable to specify
>the address space in which the variable is located. Since location
>expressions can specify different locations depending on the PC, this
>allows the variable to be optimized to have multiple locations. For
>example, sometimes in a memory location in the group address space,
>sometimes in a register, sometimes in a memory location in the private
>address space (maybe due to spilling of the register), etc.
> 
> 
> 
>Attempting to use DW_AT_address_class on variables to specify their
>address space location conflicts with DWARF stating that it applies to the
>pointee as described in #1. It also breaks the flexibility of location
>expressions allowing the location to change according to PC.
> 
> 
> 
>When a debugger evaluates a DWARF location expression it can generate a
>flat address to encode the address space. It can do this by implementing
>the XDEREF as a target specific conversion from a segment address into a
>flat address. Similarly when using a value as an address that has a
>pointer-like type with an address class, the value can be converted to a
>flat address. When accessing addresses the debugger would have to provide
>the "current thread" so that the correct group/private address space
>instance can be accessed when given a flat address that maps to the group
>or private segments. It appears both gdb and lldb provide this.
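
As a rough illustration of the flat-address conversion described above, the following Python sketch linearizes segment addresses. The aperture bases and sizes are made up (the real mapping is target specific, as noted), and `to_flat` and `from_flat` are hypothetical names rather than gdb or lldb functions.

```python
from enum import Enum


class Segment(Enum):
    GLOBAL = 0
    GROUP = 1
    PRIVATE = 2


# Hypothetical apertures: (base, size) of the flat range mapped to each segment.
APERTURES = {
    Segment.GROUP: (0x1000_0000, 0x0001_0000),
    Segment.PRIVATE: (0x2000_0000, 0x0010_0000),
}


def to_flat(segment, offset):
    """Convert a segment-relative address to a flat address."""
    if segment is Segment.GLOBAL:
        return offset                        # global addresses map directly
    base, size = APERTURES[segment]
    if not 0 <= offset < size:
        raise ValueError("offset outside the segment aperture")
    return base + offset


def from_flat(flat):
    """Recover (segment, offset) from a flat address."""
    for segment, (base, size) in APERTURES.items():
        if base <= flat < base + size:
            return segment, flat - base
    return Segment.GLOBAL, flat              # everything else is global memory
```

A debugger would still need the current work-item to pick the right instance of a group or private segment; the sketch only covers the address arithmetic.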
> 

FWIW, the use of DW_AT_address_class on pointer & reference types also is how
Nvidia's CUDA compiler describes its various segments.  I haven't encountered
any uses of DW_OP_xderef* operators, but it probably is just because it hasn't
been necessary.

-- 
Todd Allen
Concurrent Computer Corporation


Re: [Dwarf-Discuss] Some DWARFv5 draft feedback

2016-12-01 Thread Todd Allen
> 
> Enumeration types. It is allowed to have a DW_AT_byte_size on a
> DW_TAG_enumeration_type, but not DW_AT_encoding. To describe both size
> and encoding one needs to use a DW_AT_type pointing to a base type that
> represents the "underlying type". For languages where enumerations don't
> have an underlying type, or for strongly typed enums it is easier to
> attach the encoding directly than adding an indirection to a base type.
> Add DW_AT_encoding to the attribute list for DW_TAG_enumeration_type.
> 

FWIW, our Ada compiler always placed DW_AT_encoding attributes directly on
DW_TAG_enumeration_type to indicate the underlying signedness.  Ada never was
truly supported, so we just viewed it as part of the Ada support we'd defined.

-- 
Todd Allen
Concurrent Computer Corporation


Re: [Dwarf-Discuss] DW_AT_decl_file etc understood now. Question answered.

2020-02-19 Thread Todd Allen via Dwarf-Discuss
David,

DIRECTORIES:

The situation for directories seems pretty clear to me, and much the same as how
Philip saw it (I think).  From 6.2.4, item 16 (page 157, lines 4-11):

   The line number program assigns a number (index) to each of the directory
   entries in order, beginning with 0.

   Prior to DWARF Version 5, the current directory was not represented in the
   directories field and a directory index of 0 implicitly referred to that
   directory as found in the DW_AT_comp_dir attribute of the compilation unit
   debugging information entry. In DWARF Version 5, the current directory is
   explicitly present in the directories field. This is needed to support the
   common practice of stripping all but the line number sections (.debug_line
   and .debug_line_str) from an executable.

That seems pretty clearly to be a base of 0.

My understanding:

   For DWARF 2,3,4, the directory table starts with 1; 0 meant
   "go use DW_AT_comp_dir".

   For DWARF 5, the directory table starts with 0.  And presumably the 0th index
   is essentially the same as DW_AT_comp_dir.
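
A minimal sketch of that understanding, in Python. The function name and argument layout are mine, not from libdwarf or any other consumer; `directories` stands for the directory table as it appears in the line table header.

```python
def resolve_directory(version, index, directories, comp_dir):
    """Map a line-table directory index to a directory name."""
    if version >= 5:
        return directories[index]        # 0-based; entry 0 is the current directory
    if index == 0:
        return comp_dir                  # DWARF 2-4: 0 means "go use DW_AT_comp_dir"
    return directories[index - 1]        # DWARF 2-4: table entries are numbered from 1
```

So the same index 1 names the first table entry before DWARF 5 and the second entry in DWARF 5.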

FILES:

The case for the file table being base 1 seems strong.

-- 
Todd Allen
Concurrent Real-Time

On Tue, Feb 18, 2020 at 10:25:21AM -0800, Dwarf Discussion wrote:
> February 18, 2020.
> 
> Let's see if we can get a complete picture of the
> indexing into the line table header directory/file
> names tables.
> 
> The conjecture at the end of this suggests that
> all places with an actual file/directory number
> or index should be 1-origin, reserving 0 to
> mean no file named.
> 
> The parts of DWARF5 that seem to be a complete picture
> of the line table file/directory numbers are:
> 
> A) DW_AT_decl_file
> B) DW_AT_call_file  /* DWARF5 */
> C) DW_LNS_set_file
> D) The current compilation file name
> E) The current compilation directory name
> F) The line table 'file' register.
> 
> G) The array of file names in the line table header
>    (called the line table prologue in DWARF2).
> 
> H) The array of directory names in the line table header.
> 
> J) DW_MACRO_start_file (DW5)
> 
> 
> DWARF5: the first entry in the directory table file names array
>    is the name of the current compilation unit directory
>    (the same as DW_AT_comp_dir in the CU die)
> 
> DWARF2,3,4: the first entry in the directory table file names array
>    is the name of some directory, not necessarily
>    the same as DW_AT_comp_dir in the CU die
> 
> DWARF5: the first entry in the line table file names array
>    is the name of the current compilation unit (the same as
>    DW_AT_name in the CU die).
> 
> DWARF2,3,4
>    the first entry in the line table file names array
>    does not have the current compilation unit file name,
>    that name is only in DW_AT_name in the CU die.
> 
> Where DW_AT_decl_file is defined the standard(s)
> indicate 0 is reserved to mean no file is specified
> (Section 2.14, Declaration Coordinates)
> That's the only place where zero is explicitly
> mentioned.
> 
> DW_LNS_set_file is defined to set the 'file' register
> in the line state machine.
> In all versions of DWARF the default value of the 'file'
> register is 1 (not zero).
> 
> DW_MACRO_start_file (DW5):
> The source file name index is the file number in the line number information
> table for the compilation unit.
> 
> While the standard sometimes says file number and sometimes says 'index'
> this note regards them as the same thing with regard to the line table
> header arrays.
> 
> In DWARF 2,3,4 1 refers to a directory or file entry but not the
> compilation unit directory/file.
> 
> In DWARF 2,3,4 1 refers to a directory or file entry applicable to the
> compilation unit.
> 
> CONJECTURE:
> In all cases the index or array references are intended to be 1-origin.
> 
> In DWARF 2,3,4 1 refers to a directory or file entry but not the
> compilation unit directory/file.
> 
> In DWARF 2,3,4 1 refers to a directory or file entry applicable to the
> compilation unit.
> 
> Taking this approach seems to resolve all issues pretty well.
> ===
> 


Re: [Dwarf-Discuss] Use of Location Description operations in DWARF Expressions?

2020-03-23 Thread Todd Allen via Dwarf-Discuss
I recall this being intentional as well.  This is how I think of these items.
And this is just the gist of things.  I didn't put on my ABI Lawyer hat for
this:

A DWARF expression is a stack machine that evaluates to a value.

A location description describes the "location" of an object.  A "location" is
pure concept here, and doesn't necessarily require any physical location.  It
can be:
 * In memory at an address: It has a DWARF expression which computes the
   start address.
 * In a register: DW_OP_regN, DW_OP_regx.  No locdesc needed.
 * Nowhere, but with a known value: DW_OP_implicit_value, DW_OP_stack_value.
   I think of this as being an ephemeral "location", which in concrete terms
   would be in a buffer in the debugger or some other consumer.  There's a
   DWARF expression which computes the value.
 * Nowhere, but where it's an optimized-away pointer, and its designated
   (pointed-to) value is a known value, much like above.
 * Spread out across multiple distinct locations: DW_OP_piece's, where each
   piece can be any one of the above.
Oh, and one more:
 * Nowhere at all.  Go fish.

So locdescs can use a DWARF expression for a couple different purposes, or even
multiple DWARF expressions.  But they have additional operators for additional
cases (e.g. registers), and for "glue" (DW_OP_piece).
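
For anyone who prefers code to prose, here is one way to model that list in Python. It is only a sketch of the gist above: the class names and the `read` and `read_composite` helpers are invented, the implicit-pointer case is left out, and registers are modeled as a plain dictionary of byte strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Memory:            # in memory; the address comes from a DWARF expression
    address: int


@dataclass
class Register:          # DW_OP_regN / DW_OP_regx
    number: int


@dataclass
class Implicit:          # DW_OP_implicit_value / DW_OP_stack_value
    value: bytes


@dataclass
class Empty:             # nowhere at all
    pass


Simple = Union[Memory, Register, Implicit, Empty]


@dataclass
class Piece:             # one DW_OP_piece of a composite location
    location: Simple
    size: int


def read(location, size, memory, registers):
    """Return the bytes found at a simple location, or None if unavailable."""
    if isinstance(location, Memory):
        return memory[location.address:location.address + size]
    if isinstance(location, Register):
        return registers[location.number][:size]
    if isinstance(location, Implicit):
        return location.value[:size]
    return None


def read_composite(pieces, memory, registers):
    """Concatenate the pieces; the result is None if any piece is missing."""
    parts = [read(p.location, p.size, memory, registers) for p in pieces]
    return None if any(part is None for part in parts) else b"".join(parts)
```

Note that only `Memory` carries something a plain DWARF expression could have produced by itself; the other cases are exactly the special operators and glue.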

Conversely, DWARF expressions cannot use any of the locdesc special operators or
"glue".

But, of course, there could be use cases where some of these would make sense in
a DWARF expression, and we just didn't think of it.  Nothing springs to my mind
right now...  But if you have a compelling case, we certainly could move some of
those special/glue operators from the locdesc category to the DWARF expression
category.

I think it feels a little blurry only because locdescs came first, and then we
co-opted them for DWARF expressions, and restricted the use of certain operators
in that case.  And then it eventually changed into what we have now.  But a lot
of us remember the history, which creates that blur.

-- 
Todd Allen
Concurrent Real-Time

On Mon, Mar 23, 2020 at 12:04:58PM -0700, Dwarf Discussion wrote:
> On 3/23/20 6:28 AM, Robinson, Paul via Dwarf-Discuss wrote:
> > > From: Dwarf-Discuss On Behalf
> > > Of Adrian Prantl via Dwarf-Discuss
> > > > On Mar 19, 2020, at 5:49 PM, Michael Eager via Dwarf-Discuss wrote:
> > > > 
> > > > My reading of sections 2.5 & 2.6 is that you cannot have a DW_OP_piece
> > > > in a DWARF expression.
> > > > 
> > > 
> > > I wonder if this is an intentional part of the design because of
> > > ambiguity/correctness issues or is this just something that happens to
> > > fall out of the way the text is worded? I can see how such a restriction
> > > might simplify DWARF consumers, but it also seems like an arbitrary
> > > restriction for which there may not be a technical reason.
> > 
> > My intuition (clearly I wasn't there at the time) is that this is like
> > a C expression being an rvalue (DWARF expression) or lvalue (location
> > description).  Values and locations aren't the same thing.
> 
> It is somewhat an L-value vs R-value issue.
> 
> You can craft a DWARF expression to extract a value (an R-value) from
> arbitrary memory locations or registers (for example, using DW_OP_and,
> DW_OP_sh?, etc.) and place it on the top of the stack.  A DW_OP_piece
> operator doesn't do this.  (There might be value in an operator which
> extracts a value from a composite location.)
> 
> A location (an L-value) which includes potentially multiple register or
> memory locations and multiple DW_OP_piece or DW_OP_bit_piece operations
> can't be evaluated by a simple stack-based expression interpreter.
> 
> The design is intentional, AFAIK, not accidental.
> 
> I think that the description has become a bit less clear with the addition
> of the Implicit Location Descriptions in Section 2.6.1.1.4, which do compute
> values, rather than locations.  Perhaps these should have been described in
> Section 2.5 as parts of a DWARF expression, not as parts of a Location
> Description.
> 
> The description (and implementation) of DWARF expressions and locations are
> somewhat muddled together.  This can be seen in the first sentence of
> Section 2.5:
>     DWARF expressions describe how to compute a value or specify a
>     location.
> A clearer definition would specify that the DWARF expression only computes a
> value, and leaving what that value means (e.g., register/memory contents,
> arbitrary computation, memory address) to the context in which the
> expression is used.  A more precise definition of a locati

Re: [Dwarf-Discuss] Segment selectors for Harvard architectures

2020-03-23 Thread Todd Allen via Dwarf-Discuss
Paul,

I haven't needed to contend with this issue.  But as I was looking over the
standard, this was my initial gut reaction too: use the segment selectors.  This
use actually does seem like it's a characteristic of the target architecture to
me.  You started the discussion with "Harvard architectures".

DWARF does permit architectures to specify aspects of their DWARF description,
after all.  I can't recall it ever being done *formally*, but it's been done
informally for every architecture that uses DWARF.  At a bare minimum, register
encodings.  And usually you have to root around in somebody else's source code
to find it.

This one has a slightly higher chance of breaking a consumer, if that consumer
was written not to tolerate the segment selectors.  But I think it would be fair
to put any such blame on the consumer in that case.  If the consumer doesn't die
with a SIGSEGV, then it might ignore the segments.  And then it would be no
worse off than now.

On Thu, Mar 19, 2020 at 06:05:16PM +, Dwarf Discussion wrote:
> This recently came up in the LLVM project.  Harvard architectures
> put code and data into separate address spaces, but those spaces
> are not explicit; instructions that load/store memory implicitly
> use the data space, while things like taking a function address or 
> doing indirect branches will implicitly use the code space.  This 
> doubles the effective size of memory without consuming an address 
> bit, as well as having other secondary benefits like not allowing
> self-modifying code.
> 
> Nearly all of the DWARF information does not need to distinguish
> between code and data address spaces, because it's easy to derive that
> from context.  Addresses in the line table or a range list will be
> code addresses; in .debug_info, addresses of code elements will be
> code addresses, while variables will be data addresses. And so on.
> 
> This only seems to break down in the .debug_aranges section, which
> records both data and code addresses without any context to let a
> consumer know which is what.  In a flat-address architecture, no
> distinction is needed; in a segmented architecture, there will be
> a segment selector as part of any address, and that includes the
> .debug_aranges section.  What about for Harvard architectures?
> 
> What I suggested in the LLVM project is that .debug_aranges would
> have a 1-byte segment selector and use some trivial scheme such as
> 0=code, 1=data to distinguish what kind of address it is.  Other
> DWARF sections wouldn't need a selector because they can all use
> context to figure it out; this avoids the size overhead of using
> segment selectors everywhere else.
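
A sketch of what a consumer would see under that scheme, in Python. It decodes only the tuples of one .debug_aranges set and assumes a 1-byte segment selector, little-endian data, and the 0=code, 1=data convention suggested above; header parsing and alignment padding are left out, and `decode_tuples` is an invented name.

```python
import struct

CODE, DATA = 0, 1   # the trivial selector scheme suggested above


def decode_tuples(body, address_size=4):
    """Decode (segment, address, length) tuples, assuming a 1-byte selector."""
    fmt = {4: "<BII", 8: "<BQQ"}[address_size]
    tuples = []
    for segment, address, length in struct.iter_unpack(fmt, body):
        if segment == 0 and address == 0 and length == 0:
            break                        # an all-zero tuple terminates the set
        tuples.append((segment, address, length))
    return tuples


# The same numeric address appears once in each space.
example = (
    struct.pack("<BII", CODE, 0x1000, 0x200)
    + struct.pack("<BII", DATA, 0x1000, 0x40)
    + struct.pack("<BII", 0, 0, 0)
)
```

Decoding `example` gives two ranges that start at the same address but are told apart by the selector, which is the ambiguity the proposal is meant to remove.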
> 
> Pavel Labath pointed out that this seems inconsistent and might
> make consumers unhappy; segment selectors are described as a
> characteristic of the target architecture, so having them in one
> place and not others might look suspicious.  IMO it's a reasonable 
> "permissive" use of the existing DWARF structures, but it seemed
> worth asking here.
> 
> Does this (segment selector only in .debug_aranges) sound okay?
> Should there be non-normative text or a wiki description of this?
> Do we want to codify the 0=code 1=data use of segment selectors
> for all Harvard architectures (that don't otherwise have explicit
> segments) so that this doesn't have to be set by ABI committees?
> 
> I'm willing to write up whatever needs writing up, either as a
> proposal or as a wiki entry.
> 
> Thanks,
> --paulr
> 

-- 
Todd Allen
Concurrent Real-Time


Re: [Dwarf-Discuss] modeling different address spaces

2020-07-16 Thread Todd Allen via Dwarf-Discuss
Markus, Michael, David, Xing,

I always assumed that the segment support in DWARF was meant to be more general,
and support architectures where there was no single flat memory, and so the
segments were necessary for memory accesses.  I personally have not dealt with
any architectures where DW_AT_segment came into play, though.

Certainly x86 does not fall into that "truly distinct segments" category, at
least not in modern times.  The segment registers there (fs & gs, for example)
are an indirect way of specifying a base address within the flat address space.
They usually end up being used for thread-specific data structures where each
thread has a different segment selector which implies a different base address.
And it requires a syscall to interact with the base addresses, at least on
Linux.  The other segment registers (cs, ds, ss) are set-and-forget by the OS
typically.

The CUDA architecture is an interesting case.  It doesn't use DW_AT_segment at
all.  But it does use the DW_AT_address_class attribute to specify CUDA segments
(e.g. Global, Local, Shared, among many others) for variables and/or types.  So
it's fairly fine-grained.  You can, for example, have a shared pointer to a
global pointer to a local integer, and the DW_AT_address_class attribute can
convey that.
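
As a sketch of how a consumer might read such a chain, here is a small Python model. The `TypeDie` class, the `describe` helper and the segment names are invented for illustration; they are not the actual CUDA enum values, and the model assumes the address class on a pointer type describes the memory it points into, while the variable's own class says where the variable lives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TypeDie:
    tag: str                                  # e.g. "DW_TAG_pointer_type"
    name: Optional[str] = None                # DW_AT_name, for base types
    target: Optional["TypeDie"] = None        # DW_AT_type
    address_class: Optional[str] = None       # DW_AT_address_class of the pointee


def describe(storage, die):
    """Spell out where each level of a (possibly nested) pointer lives."""
    if die.tag == "DW_TAG_pointer_type":
        return f"{storage} pointer to " + describe(die.address_class, die.target)
    return f"{storage} {die.name}"


integer = TypeDie("DW_TAG_base_type", name="int")
inner = TypeDie("DW_TAG_pointer_type", target=integer, address_class="local")
outer = TypeDie("DW_TAG_pointer_type", target=inner, address_class="global")
```

Calling `describe("shared", outer)` walks the chain one DW_AT_type link at a time and reports the segment at each level.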

Some of those CUDA segments are for radically different sorts of memory
(e.g. very low latency Shared memory vs. high latency Global memory).  But other
distinctions seem more gratuitous (e.g. Param vs. Global memory).  I assume that
there's a CUDA under-the-hood mapping of many of the segments to regions of a
flat Global address space in there, but the CUDA architectures & drivers
deliberately hide that mapping.  So effectively you end up with all the segments
being distinct, as far as a debugger can tell.

On Thu, Jul 16, 2020 at 09:23:51AM +, Dwarf Discussion wrote:
>Hello,
> 
> 
> 
>What would be the recommended way to model variables that are allocated to
>different address spaces?
> 
> 
> 
>I found DW_OP_xderef for dereferencing address-space qualified pointers
>but the resulting memory location description wouldn't have an
>address-space qualifier.
> 
> 
> 
>I found DW_AT_address_class, which allows attaching an integer, which
>could represent the address-space.  This sounds pretty close.  I'm a bit
>thrown off by the example, though.
> 
> 
> 
>Thanks,
> 
>Markus.
> 

-- 
Todd Allen
Concurrent Real-Time


Re: [Dwarf-Discuss] modeling different address spaces

2020-07-20 Thread Todd Allen via Dwarf-Discuss
Markus,

My experience with an architecture like this also is a GPU: the Nvidia CUDA
GPUs.  I don't work on nvcc.  I'm coming at this from the consumer side.  But
what I've observed:

They use DW_AT_address_class with a CUDA-specific enum of address spaces, with
values for things like: global memory, shared memory, const memory, etc.  They
don't attach these attributes to subroutines, because all the code on that
architecture is in a single "code" memory.  They do attach them to pointer
types, as the DWARF spec describes.  They also attach them to variables,
formals, etc.  That's a vendor extension (which I'd forgotten until I looked it
up again in the DWARF spec).  But an obvious one.  We might want to formalize it
at some point.

Anyway, these are the sorts of things we see:

   DW_TAG_pointer_type
      DW_AT_type          : ...
      DW_AT_address_class : ptxGenericStorage

   DW_TAG_variable
      DW_AT_name          : myConstant
      DW_AT_type          : ...
      DW_AT_location      : ...
      DW_AT_address_class : ptxConstStorage

   DW_TAG_variable
      DW_AT_abstract_origin : ...
      DW_AT_location        : ...
      DW_AT_address_class   : ptxLocalStorage

I don't know your architecture, but I'd expect something similar to work for any
GPU with heterogeneous memories.

-- 
Todd Allen
Concurrent Real-Time

On Mon, Jul 20, 2020 at 08:31:53AM +, Dwarf Discussion wrote:
> Hello Michael,
> 
> > > What would be the recommended way to model variables that are allocated
> > > to different address spaces?
> > 
> > Can you describe the architecture a bit?
> 
> It's a GPU.  It uses a different address space for shared local memory.
> 
> 
> > > I found DW_OP_xderef for dereferencing address-space qualified pointers
> > > but the resulting memory location description wouldn't have an
> > > address-space qualifier.
> > 
> > DW_OP_xderef translates from an architecturally defined memory
> > reference including an address space into a linear address space
> > (generic type).  DWARF doesn't support computations on address-space
> > qualified addresses, although using a typed stack, this could be an
> > extension.
> 
> I don't see a need for this, right now.  It should suffice to describe that an
> object lives in address-space A so the location expression yields an A-address.
> 
> In another email you said: "
> CUDA address spaces or a DSP with multiple distinct address spaces are 
> what would conventionally be described as segmented memory.  I think 
> that using the DW_AT_address_space would be reasonable.
> ".
> 
> I assume you mean DW_AT_address_class.  This should do the trick.  I just
> wasn't sure if that's the intended use of that attribute.
> 
> 
> > > I found DW_AT_address_class, which allows attaching an integer, which
> > > could represent the address-space.  This sounds pretty close.  I'm a bit
> > > thrown off by the example, though.
> > 
> > Which example?
> 
> Table 2.7 "Example address class codes" on p. 48.  It uses DW_AT_address_class
> to describe addressing modes.
> 
> Regards,
> Markus.


Re: [Dwarf-Discuss] string reduction techniques

2021-11-01 Thread Todd Allen via Dwarf-Discuss
Dave,

If I understand right: The space saving you're expecting is the near-elimination
of DW_AT_name strings.  If they are only simple names like "T" and "int", they
can be placed into the string table once each, and it should be very small.  But
you're expecting the DW_AT_linkage_name attributes still to have lots of
replication because of the large composed names.  So I gather that was where
your estimate of 1/2 reduction came from.

I was trying to figure out how we came to opposite conclusions, and I think it's
that I have this (implicit) assumption of a sort of "DWARF Moore's Law", that
the size of debug info/strings/etc. would double periodically, just based on the
tendency of software systems to grooow.  I'm likening it to Moore's Law,
because I expect it's the same sort of vague, rough estimate that somehow still
applies to the real world.

Assuming it does apply, your halving of the string table amounts to buying
yourself one doubling period, and then you're back to requiring DWARF64 string
tables.  (Meanwhile, DWARF64 gives us 32 doubling periods over DWARF32.  So
hopefully that will last us for a while...)
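
The arithmetic behind those two numbers, as a short Python check. It assumes the size limits are simply 2^32 and 2^64 bytes and that "one doubling period" means one factor of two in size; the helper name is mine.

```python
import math

DWARF32_LIMIT = 2**32   # bytes addressable by a 32-bit section offset
DWARF64_LIMIT = 2**64   # bytes addressable by a 64-bit section offset


def doubling_periods(old_size, new_size):
    """Number of doublings needed to grow from old_size to new_size."""
    return math.log2(new_size / old_size)


halving_buys = doubling_periods(1, 2)                          # halve the strings
dwarf64_buys = doubling_periods(DWARF32_LIMIT, DWARF64_LIMIT)  # switch formats
```

Halving the table is worth a single period, while the wider offsets are worth 32 of them.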

I can't be sure about this exponential growth.  I don't have the data to back it
up.  But I will say, when we created DWARF64, I was skeptical that it would be
needed during my career.  And yet here we are...

...

The reduction for DW_AT_linkage_name does seem like a tougher nut to crack.  As
you mentioned, there is a tendency to eliminate *some* of the replication
because of the mangler's use of substitution strings (S_, S0_, S1_, etc.)  But
that same feature probably would make it a lot harder to do anything clever
about chopping up the linkage names into substrings.

Honestly, I've never been sure why gcc generates DW_AT_linkage_name.  Our
debugger almost never uses it.  (There is one use to detect "GNU indirect"
functions.)  I wonder if it would be possible to avoid them if you provided
enough info about the template parameters, if the debugger had its own name
mangler.  I had to write one for our debugger a couple years ago, and it
definitely was a persnickety beast.  But doable with enough information.  Mind
you, I'm not sure there is enough information to do it perfectly with the state
of DWARF & gcc right now.

Todd

On Mon, Nov 01, 2021 at 01:06:33PM -0700, David Blaikie wrote:
>Hey Todd,
> 
>Just some details regarding the string reduction strategies I'm pursuing
>to address DWARF32 overflowing .debug_str.dwo/.debug_str_offsets.dwo
>sections in some large binaries at Google.
> 
>So the extreme cases I'm dealing with are predominantly C++ Expression
>templates (in TensorFlow and Eigen) - these produce types with very large
>DW_AT_names ("f1<int>") and DW_AT_linkage_names (eg: "_Z2f1IiEvv") (but
>with many more template parameters, none of which are ever user-written
>but deduced).
> 
>So the main fix I'm pursuing (roughly called "simplified template names")
>is to omit template parameter lists from DW_AT_names of templates in most
>cases, allowing the consumer to reconstruct the name from
>DW_AT_template_*_parameters itself, recursively. Further discussion and
>details
>here: [1]https://groups.google.com/g/llvm-dev/c/ekLMllbLIZg/m/-dhJ0hO1AAAJ
>- in terms of how this affects scaling factors, it means that adding an
>additional template instantiation of existing types would add no new data
>to .debug_str (eg: going from a program with "t1" to "t1>"
>would add no new entries to .debug_str). Not all names can be readily
>reconstructed - so I'm opting the feature out on those, but we could have
>a deeper discussion about how to handle them if we wanted to make
>this a full-fledged/robust feature (maybe one the DWARF spec
>suggests/encourages).
> 
>GDB seems to handle this sort of debug info OK - I guess someone did real
>work to support that at some point (so maybe some other debugger already
>generates DWARF like this).
> 
>The other half, though, is DW_AT_linkage_names - and in theory similar
>rebuilding could be done, but that'd require baking a lot of
>implementation knowledge into the DWARF Consumer that DWARF is meant to
>help avoid... so I'm unsure what the right solution is there just now, but
>there's a few ideas I'm still kicking around. At least linkage names have
>less redundancy (within a single name they avoid redundancy - "t1<t1<int>,
>t1<int>>" only ends up with a single description of "t1<int>" instead of
>two of them like you get with the DW_AT_name) than DW_AT_names, so they do
>scale a bit better already.
> 
>Happy to discuss these ideas in specific, or their impact on debug_str
>growth in more detail any time (here, video chat, discords, etc).
> 
>- Dave
> 
> References
> 
>Visible links
>1. https://groups.google.com/g/llvm-dev/c/ekLMllbLIZg/m/-dhJ0hO1AAAJ
___
Dwarf-Discuss mail
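
A rough sketch of the "simplified template names" idea described above, in case it helps: the consumer rebuilds the full name recursively from the template parameters, so nesting an existing instantiation needs no new strings.  The TypeDie class, the helper names and the "name<arg, arg>" spelling are assumptions made for illustration; this is not clang's or llvm-dwarfdump's actual implementation, and real DW_AT_template_*_parameter DIEs carry more than a type name.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TypeDie:
    """Toy stand-in for a type DIE (hypothetical, not a real DWARF library)."""
    simple_name: str  # DW_AT_name with the template argument list omitted
    template_args: List["TypeDie"] = field(default_factory=list)


def rebuild_name(die: TypeDie) -> str:
    """Rebuild 'name<args>' by recursing through the template parameters."""
    if not die.template_args:
        return die.simple_name
    args = ", ".join(rebuild_name(arg) for arg in die.template_args)
    return f"{die.simple_name}<{args}>"


def strings_needed(die: TypeDie, seen: Set[str]) -> Set[str]:
    """Collect the simple names a string table would have to hold."""
    seen.add(die.simple_name)
    for arg in die.template_args:
        strings_needed(arg, seen)
    return seen


int_type = TypeDie("int")
t1_int = TypeDie("t1", [int_type])
t1_nested = TypeDie("t1", [t1_int])

print(rebuild_name(t1_int))
print(rebuild_name(t1_nested))
# Both instantiations need the same set of strings: {"t1", "int"}.
print(strings_needed(t1_int, set()) == strings_needed(t1_nested, set()))
```

Names that cannot be rebuilt this way would still have to be emitted in full, as the message says.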

Re: [Dwarf-Discuss] string reduction techniques

2021-11-07 Thread Todd Allen via Dwarf-Discuss
's not got a bit of
>  "settling in" to do. And I'm still rather hopeful we might be able
>  to reduce the overheads enough to avoid widespread use of DWARF64 -
>  but it's not a sure thing by any means.
> 
>  Agreed. I'd like to explore as many avenues as we can to eliminate
>  the
>  need for DWARF64.
> 
>  >> Honestly, I've never been sure why gcc generates DW_AT_linkage_name.  Our
>  >> debugger almost never uses it.  (There is one use to detect "GNU indirect"
>  >> functions.)  I wonder if it would be possible to avoid them if you provided
>  >> enough info about the template parameters, if the debugger had its own name
>  >> mangler.  I had to write one for our debugger a couple years ago, and it
>  >> definitely was a persnickety beast.  But doable with enough information.  Mind
>  >> you, I'm not sure there is enough information to do it perfectly with the state
>  >> of DWARF & gcc right now.
>  >
>  > Yeah, that was/is certainly my first pass - the way I've done the
>  > DW_AT_name one is to have a feature in clang that produces the short
>  > name "t1" but then also embeds the template argument list in the
>  > name (like this: "_STNt1|<int>") - then llvm-dwarfdump will detect
>  > this prefix, split up the name, rebuild the original name as it
>  > would if it'd been given only the simple name ("t1") and compare it
>  > to the one from clang. Then I can run this over large programs and
>  > check everything round-trips correctly & in clang, classify any
>  > names we can't roundtrip so they get emitted in full rather than
>  > shortened.
>  > We could do something similar with linkage names - nice to know
>  > there's some prior art in your work there.
>  >
>  > I wouldn't be averse to considering what it'd take to make DWARF
>  > robust enough to always roundtrip simple and linkage names in this
>  > way - I don't think it'd take a /lot/ of extra DWARF content.
> 
>  Fuzzy memory here, but as I recall, GCC didn't generate linkage names
>  (or only did in some very specific cases) until the LTO folks
>  convinced us they needed it in order to relate profile data back to
>  the source. Perhaps if we came up with a better way of doing that, we
>  could eliminate the linkage names.
> 
>No, see, that's a mildly reasonable answer.
>If you go far enough back, the linkage names exist for a few reasons:
>1. Because the debug info wasn't always good enough, and so GDB used
>to demangle the linkage names and parse them using a hacked up C++-ish
>parser for type info.
>2. Even when it didn't, it decoded linkage names to detect things like
>destructors/constructors, etc.
>3. Because it used it to do remangling properly and try to generate
>method signatures to look up (and for #1)
>4. Because it was used to do symbol lookup in the ELF/etc symbol
>tables for static things/etc.
>5. Because it saved space in STABS to do #1 (they predate DWARF by
>far).
>If you check out gdb source code, circa 2001, and search for things
>like check_stub_method, and follow all the things it calls (like
>gdb_mangle_name), you can learn the history of linkage names (and
>probably throw up in your mouth a little).
> If you do a case insensitive search for things like "physname" and
>"phys_name", you'll see all the places it used to use the linkage
>names.
>I spent a lot of time abstracting out things like the
>constructor/destructor name testing, vptr name finding, etc, so that
>someone later might have a chance to get rid of linkage names (it was
>also necessary because of the gcc 2.95->3.0 ABI change).
> 
>___
>Dwarf-Discuss mailing list
>[5]Dwarf-Discuss@lists.dwarfstd.org
>[6]http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
> 
>  ___
>  Dwarf-Discuss mailing list
>  [7]Dwarf-Discuss@lists.dwarfstd.org
>  [8]http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
> 
> References
> 
>Visible links
>1. mailto:dwarf-discuss@lists.dwarfstd.org
>2. https://godbolt.org/z/TqYjeevqx
>3. mailto:dwarf-discuss@lists.dwarfstd.org
>4. mailto:dwarf-discuss@lists.dwarfstd.org
>5. mailto:Dwarf-Discuss@lists.dwarfstd.org
>6. http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
>7. mailto:Dwarf-Discuss@lists.dwarfstd.org
>8. http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org

> ___
> Dwarf-Discuss mailing list
> Dwarf-Discuss@lists.dwarfstd.org
> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


-- 
Todd Allen
Concurrent Real-Time
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


Re: [Dwarf-Discuss] [EXTERNAL] - RE: Multiple floating point types with the same size but different encodings

2022-01-25 Thread Todd Allen via Dwarf-Discuss
> 
> For ATE codes, the problem is with standardization if we wanted
> to standardize it in some way for DWARF6.
> The current DW_ATE_{,complex_}float is way too unspecific and historically
> can be about various formats.
> So, we'd need something like DW_ATE_{,complex_}ieee754_float
> (or ieee754_binary_float?)
> which would, depending on DW_AT_byte_size, be binary{16,32,64,128,256}
> format, and then add DW_ATE_* values for the floating point formats
> known to us, which would be
> vax_f_float, vax_g_float, vax_d_float (what about vax_h_float?)
> bfloat16
> Intel extended precision
> IBM extended (double double)
> what else?
> Anyway, it might be possible as can be seen in the DW_ATE_HP_*
> extensions to reuse the same DW_ATE_* code for multiple different
> formats as long as they are guaranteed to have different DW_AT_byte_size.
> 
> For DW_AT_precision/DW_AT_min_exponent/DW_AT_max_exponent we would
> just define them the same way as C/C++ defines the corresponding
> macros, e.g. https://en.cppreference.com/w/c/types/limits
> (though of course, we can only reasonably use properties that are
> expressible as small integral values or booleans, can't have
> attributes matching to say FLT_MAX etc., which need some floating point
> values).  All could be optional and the producers would need to use them
> only if without those attributes it would be ambiguous what it is.
> 

I suspect you'd end up needing more attributes to completely pin down a
floating-point type.  Consider the i86/i87 FPU 80-bit floating-point type.  It
had a single bit which was the integer part, in addition to the traditional
fractional bits of the mantissa.  So determining the number of bits in the
exponent is not simply (bit_size - mantissa_bits - sign_bit).  Also, you'd need
to indicate lack of support for inf/nan, unless you were expecting that to be
deduced from the min_exponent/max_exponent.
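
To make the arithmetic concrete, here is a small sketch of the bit accounting.  The FloatLayout class and its field names are illustrative only, and the x87 widths used (1 sign bit, 15 exponent bits, 1 explicit integer bit, 63 fraction bits) are the commonly documented layout of the 80-bit extended format rather than anything a DWARF attribute currently carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatLayout:
    """Illustrative description of a floating-point format's bit fields."""
    bit_size: int
    sign_bits: int
    exponent_bits: int
    integer_bits: int   # explicit integer bit; 0 for IEEE 754 binary formats
    fraction_bits: int

    def is_consistent(self) -> bool:
        """All fields together must account for every bit of the type."""
        total = (self.sign_bits + self.exponent_bits
                 + self.integer_bits + self.fraction_bits)
        return total == self.bit_size


def naive_exponent_bits(layout: FloatLayout) -> int:
    """The shortcut that only works when there is no explicit integer bit."""
    return layout.bit_size - layout.fraction_bits - layout.sign_bits


BINARY64 = FloatLayout(64, 1, 11, 0, 52)
X87_EXTENDED = FloatLayout(80, 1, 15, 1, 63)

print(naive_exponent_bits(BINARY64), BINARY64.exponent_bits)
print(naive_exponent_bits(X87_EXTENDED), X87_EXTENDED.exponent_bits)
```

For binary64 the shortcut matches the real exponent width, while for the 80-bit type it overshoots by the one integer bit, so a consumer would need that extra piece of information.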

The encodings do seem like a cleaner approach, as Ron argued: You either
recognize the enumerated value, probably because your hardware supports it
natively, and you don't care all that much about all the persnickety bits; or
you reject the type.  I suppose you might choose to support a non-native
floating-point type, but I suspect you'd need to have a priori knowledge of all
the details anyway.

If we do end up promoting any of these architecture-specific types into the
standard, perhaps we should provide some of the implementation details about
what they mean.  We could do that by referencing other documents, but it seems
reasonable to include a table containing the number of bits for each field, and
a mention of any major peculiarities (e.g. skewed bias, inf/nan unsupported,
etc.).

-- 
Todd Allen
Concurrent Real-Time


Re: [Dwarf-Discuss] EXTERNAL: Corner-cases with bitfields

2022-05-06 Thread Todd Allen via Dwarf-Discuss
> 
> Dear all,
> 
> During our work on debugging support of compute workloads on AMDGPU[1],
> we (at AMD) have been seeing two cases regarding description of
> bitfields in DWARF for which we do not find definitive answers in the
> DWARF documentation.  For those cases, when experimenting with usual CPU
> targets we observe different behaviors on different toolchains.  As a
> consequence, we would like to discuss those points here to gather
> feedback.  If deemed necessary, we will submit a formal clarification
> issue to the DWARF committee.
> 
> Both of the cases we present below impact how arguments are passed
> during function calls in the ABI for at least our target (AMDGPU).
> However, the debug information available to the debugger does not give
> enough information to decide how to handle the type and the spec does
> not really say what debug information should be generated to properly
> describe those cases.  Also note that in both cases, the DWARF
> information we have is sufficient to describe the memory layout of the
> types.
> 
> 1 - Bitfield member with a size matching its underlying type:
> 
> The first point we would like to discuss is the one of bitfield members
> whose sizes match their underlying type.  Let's consider the following
> example:
> 
>  struct Foo
>  {
>    char a : 8;
>    char b : 8;
>  };
> 
> If we compile such an example with GCC, it will add the `DW_AT_bit_size` and
> `DW_AT_bit_offset` attributes to the `a` and `b` DIEs.
> 
> Clang on the other hand will not produce those attributes.
> 
> On the debugger side, GDB currently considers a struct member as being
> packed (i.e. part of a bitfield) if it has the `DW_AT_bit_size`
> attribute present and is non-0.  Therefore, GDB will "understand"
> what GCC produces, but not what Clang produces.
> 
> What Clang does seems to be a reasonable thing to do if one is only
> interested in the memory layout of the type.  This however is not
> sufficient in our case to decide how to handle such type when
> placing/inspecting arguments in registers in the context of function
> calls. In our ABI, bitfield members are passed packed together, while
> two chars in a struct would be placed in separate registers.
> 
> To clarify this situation, it would be helpful if a producer always
> included the DW_AT_bit_size attribute for bit fields, which the standard
> neither suggests nor requires.
> 

It sounds like your ABI is basing its decision on a boolean: is the field a bit
field or not.  And you're trying to deduce this from DW_AT_bit_offset.  Perhaps
a better solution would be to make this explicit in the DWARF, some new
DW_AT_bitfield flag.  There's very little that the DWARF standard can do to
mandate such an attribute.  (Permissive standard yadda yadda.)  But if it's
necessary for debuggers to work correctly in a given ABI, compilers should be
well-motivated to produce it when generating code for that ABI.
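
A minimal sketch of the two approaches, with member DIEs modelled as plain dicts.  The first function mirrors the heuristic attributed to GDB in the quoted message (DW_AT_bit_size present and non-zero); the second assumes the explicit flag suggested above.  DW_AT_bitfield is a hypothetical attribute name, not part of any DWARF version, and the dict encoding is only for illustration.

```python
def is_bitfield_by_size(member: dict) -> bool:
    """Heuristic: treat the member as a bit field if DW_AT_bit_size is non-zero."""
    return member.get("DW_AT_bit_size", 0) != 0


def is_bitfield_by_flag(member: dict) -> bool:
    """Explicit alternative: rely on a (hypothetical) DW_AT_bitfield flag."""
    return bool(member.get("DW_AT_bitfield", False))


# `char a : 8;` as a producer that emits DW_AT_bit_size might describe it.
with_bit_size = {"DW_AT_name": "a", "DW_AT_bit_size": 8}
# The same member from a producer that omits the size because it matches the type.
without_bit_size = {"DW_AT_name": "a"}
# The same member carrying the hypothetical explicit flag instead.
with_flag = {"DW_AT_name": "a", "DW_AT_bitfield": True}

print(is_bitfield_by_size(with_bit_size))     # the heuristic sees a bit field
print(is_bitfield_by_size(without_bit_size))  # the heuristic misses it
print(is_bitfield_by_flag(with_flag))         # the flag does not depend on sizes
```

The point of the flag is that the answer no longer depends on whether a producer chose to emit a size that happens to equal the size of the underlying type.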

> 2 - Unnamed zero sized bitfield
> 
> Another case we met is related to unnamed fields with a size of 0 bits.
> Such a field is defined in the C++ standard as (in 9.6 Bit-Fields):
> 
>  > As a special case, an unnamed bit-field with a width of zero
>  > specifies alignment of the next bit-field at an allocation unit
>  > boundary
> 
> If we now consider an extended version of our previous example:
> 
>  struct Foo
>  {
>    char a : 8;
>    char : 0;
>    char b : 8;
>  };
> 
> Neither GCC nor Clang give any information about the unnamed bitfield.
> As for the previous case, the presence of such field impacts how
> arguments are passed during function calls on our target, leaving the
> debugger unable to properly decide how to handle such cases.
> 
> As for the previous case, both compilers can properly describe Foo's
> memory layout using DW_AT_bit_offset.
> 
> It seems that such a 0-sized field also has an impact on the ABI on other
> targets such as armv7hl and aarch64, as discussed in [2].  Should the DWARF
> specification give some guidance on how to handle such a situation?
> 
> All thoughts on those cases are welcome.
> 

I'm agreeing with Michael that describing the unnamed bitfield seems dubious.
If it does impact the ABI, I'm wondering if that impact is indirect: that is,
the presence of this 0-width bit field changes an attribute of the next field,
and that attribute is responsible for the difference in behavior.  If so, is
there any way other than a 0-width bit field to cause the same behavior?  This
might be another case where describing the attribute that's directly responsible
might be better.

-- 
Todd Allen
Concurrent Real-Time


Re: [Dwarf-Discuss] EXTERNAL: Corner-cases with bitfields

2022-05-09 Thread Todd Allen via Dwarf-Discuss
On Mon, May 09, 2022 at 09:41:03PM +, Dwarf Discussion wrote:
> > Pedro Alves wrote:
> > On 2022-05-09 16:48, Ron Brender via Dwarf-Discuss wrote:
> > > So my suggestion is to file a bug report with CLANG, requesting they
> > correct their DWARF output to reflect all details needed
> > > by your language.
> > 
> > > An issue here is that DWARF does say this, in (DWARF 5, 5.7.6 Data Member
> > > Entries, page 119):
> > > 
> > >  "If the size of a data member is not the same as the size of the type
> > >                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >  given for the data member, the data member has either a DW_AT_byte_size
> > >  ^^^^^^^^^^^^^^^^^^^^^^^^^
> > >  or a DW_AT_bit_size attribute whose integer constant value (see Section
> > >  2.19 on page 55) is the amount of storage needed to hold the value of
> > >  the data member."
> > > 
> > > Note the part I underlined.  In Lancelot's case, the size of the data member
> > > IS the same as the size of the type given for the data member.  So Clang
> > > could well pedantically claim that they _are_ following the spec.  Shouldn't
> > > the spec be clarified here?
> 
> What the spec says is that a producer isn't _required_ to emit the
> DW_AT_bit_size attribute.  But, given that DWARF is a permissive
> standard, the producer is certainly _allowed_ to emit the attribute.  
> If this is a hint that the target debugger will understand, regarding
> the ABI, it seems okay to me for the producer to do that.
> 
> > This then raises the question of whether a debugger can assume that the
> > presence of a DW_AT_bit_size
> > attribute indicates that the field is a bit field at the C/C++ source
> > level.  GDB is assuming that
> > today, as there's really no other way to tell, but I don't think the spec
> > explicitly says so.
> 
> GDB is choosing to make that interpretation, which it's allowed to do.
> The DWARF spec just doesn't promise that interpretation is correct.
> 
> You can propose to standardize that interpretation by filing an issue
> with the DWARF committee at https://dwarfstd.org/Comment.php and it might
> or might not become part of DWARF v6.  It might be tricky because you'd
> be generalizing something very specific to your environment.
> 
> You can also, separately, try to get Clang to emit the DW_AT_bit_size
> attribute in these cases for the AMDGPU target(s).  This seems more
> likely to work, especially as there's an ABI requirement involved, and
> (given that GDB makes this interpretation) I assume gcc already does this.
> 

Lancelot,

I suppose, if you didn't want to submit an issue, another solution would be to
require the necessary tags & attributes in the ABI itself.  We already expect
ABI documents to provide things like register values, CFI initial values, and
some more esoteric stuff (augmentations, non-standard endianity & isa).  An ABI
that required descriptions in ABI-specific situations like these two seems
reasonable to me.  And it places no burden on compilers for other ABI's.

-- 
Todd Allen
Concurrent Real-Time


Re: [Dwarf-Discuss] EXTERNAL: Corner-cases with bitfields

2022-05-09 Thread Todd Allen via Dwarf-Discuss
On Mon, May 09, 2022 at 04:09:59PM -0700, Michael Eager wrote:
> On 5/9/22 16:00, Todd Allen via Dwarf-Discuss wrote:
> > I suppose, if you didn't want to submit an issue, another solution would be to
> > require the necessary tags & attributes in the ABI itself.  We already expect
> > ABI documents to provide things like register values, CFI initial values, and
> > some more esoteric stuff (augmentations, non-standard endianity & isa).  An ABI
> > that required descriptions in ABI-specific situations like these two seems
> > reasonable to me.  And it places no burden on compilers for other ABI's.
> 
> This creates the situation where there are two definitions for a DWARF
> attribute, one in an ABI and a different one in the DWARF Spec.  We want to
> avoid situations where one producer says "I'm following DWARF" and another
> "I'm following the ABI".  That makes interoperability difficult.
> 
> The information you mention in an ABI is not in the DWARF Spec.
> 

I don't know that it's quite that bad.  The ABI could say that DW_AT_bit_size
*also* implies that the field is a bit field for ABI purposes.  That doesn't
change the meaning from the DWARF specification; it merely adds to it.  Mind
you, I think an explicit DW_AT_bit_field (or something like that) is better.

Also, while the DWARF standard is intentionally permissive, an ABI need not be.
They could mandate either of the above solutions, and also mandate descriptions
of anonymous 0-sized fields.  (Unless there's a better, more direct
description for that case.)

-- 
Todd Allen
Concurrent Real-Time


Re: [Dwarf-discuss] EXTERNAL: Re: ISSUE: tensor types. V3

2023-04-21 Thread Todd Allen via Dwarf-discuss
I've been playing catch-up on this discussion today.  I was convinced of the
value early on just based on the need of this information to follow the ABI
parameter passing rules on certain architectures.  And I was with you right
up until this V3 version.  Comments below:

On Thu, Apr 13, 2023 at 11:57:08AM -0700, Dwarf Discussion wrote:
>I didn't put back any changes that would allow these tensor types to
>appear on the DWARF stack. I feel that particular topic hasn't been
>settled yet. The general plan is I will work with Jakub and create some
>cases where a compiler could want to put these vector types on the DWARF
>stack. Tony Tye and the AMD team believe that the vector types do not need
>to be on the stack and believe that all the cases where the debuggers
>would want to access elements within the vector can be addressed with
>offsetting. IIUC a key point seems to be that they have never seen a case
>where an induction variable was embedded in a slot in a vector register,
>it always is a scalar. (I am not sure that I fully grokked their argument
>-- so please correct me) In the cases where it was, it could still be
>accessed as an implicit. Once I've got some examples of how a debugger
>might want to put vector types on the DWARF stack, the AMD team can
>suggest alternative approaches. I said that I would make a V4 proposal if
>the group ultimately comes to a consensus that vector registers are in
>fact needed on the stack.

A proposal to allow vector types on the DWARF expression stack easily could be
a distinct proposal, although it obviously would have a dependency on this one.
This seems like a good application of the "keep proposals small" philosophy.

>Insert the following paragraph between the first paragraph of
>normative text describing DW_TAG_array_type and the second paragraph
>dealing with multidimensional ordering.
> 
>
>An array type that refers to a vector or matrix type, shall be
>denoted with DW_AT_tensor whose integer constant, will specify the
>kind of tensor it is. The default type of tensor shall be the kind
>used by the vector registers in the target architecture.
> 
>Table 5.4: Tensor attribute values
>------------------------------------------------------------------
>Name              | Meaning
>------------------------------------------------------------------
>DW_TENSOR_default | Default encoding and semantics used by target
>                  | architecture's vector registers
>DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
>DW_TENSOR_opencl  | OpenCL vector encoding and semantics
>DW_TENSOR_neon    | NEON vector encoding and semantics
>DW_TENSOR_sve     | SVE vector encoding and semantics
>------------------------------------------------------------------

As someone who was not sitting in on your debugging GPUs discussions, this table
is baffling.  Is it based on the "Vector Operations" table on the clang
LanguageExtensions page you mentioned?  That page is a wall of text, so I might
have missed another table, but these values are a subset of columns from that
table.

1 of the values here is a source language (opencl), 2 reflect specific vector
registers of one specific architecture (neon & sve), and I don't even know what
boolean is meant to be.  Maybe a type that you would associate with predicate
registers?  I think this table needs a lot more explanation.

How do you envision debuggers using this information?  Merely disallowing things
like operator++, or disallowing casts, or certain flavors of casts?  (Those were
the differences I spotted in that table.)  This doesn't seem terribly
compelling.  But if others think it is, maybe this should be broken up into
distinct features instead of a lumpy enum?

-- 
Todd Allen
Concurrent Real-Time
-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss


Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-24 Thread Todd Allen via Dwarf-discuss
On 4/21/23 16:31, Ben Woodard via Dwarf-discuss wrote:
>>
>>>     Insert the following paragraph between the first paragraph of
>>>     normative text describing DW_TAG_array_type and the second paragraph
>>>     dealing with multidimensional ordering.
>>>
>>>     An array type that refers to a vector or matrix type, shall be
>>>     denoted with DW_AT_tensor whose integer constant, will specify the
>>>     kind of tensor it is. The default type of tensor shall be the kind
>>>     used by the vector registers in the target architecture.
>>>
>>>     Table 5.4: Tensor attribute values
>>>     ------------------------------------------------------------------
>>>     Name              | Meaning
>>>     ------------------------------------------------------------------
>>>     DW_TENSOR_default | Default encoding and semantics used by target
>>>                       | architecture's vector registers
>>>     DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
>>>     DW_TENSOR_opencl  | OpenCL vector encoding and semantics
>>>     DW_TENSOR_neon    | NEON vector encoding and semantics
>>>     DW_TENSOR_sve     | SVE vector encoding and semantics
>>>     ------------------------------------------------------------------
>> As someone who was not sitting in on your debugging GPUs discussions, this table
>> is baffling.  Is it based on the "Vector Operations" table on the clang
>> LanguageExtensions page you mentioned?
> Yes
>> That page is a wall of text, so I might
>> have missed another table, but these values are a subset of columns from that
>> table.
>>
>> 1 of the values here is a source language (opencl), 2 reflect specific vector
>> registers of one specific architecture (neon & sve), and I don't even know what
>> boolean is meant to be.  Maybe a type that you would associate with predicate
>> registers?  I think this table needs a lot more explanation.
>
> This was something that Pedro pointed out and it was something that I
> hadn't thought of. The overall justification for this is that these
> types were semantically different than normal C arrays in several
> distinct ways. There is this table which explains the differences:
> https://clang.llvm.org/docs/LanguageExtensions.html#vector-operations
> The argument is that the semantics of different flavors are different
> enough that they need to be distinct.
>
> I really do not know much of anything about OpenCL style vectors, I
> wouldn't at all be against folding that constant in because it is
> something that could be inferred from the source language. I left it in
> because I thought that there might exist cases where clang compiles
> some OpenCL code that references some intrinsics written in another
> language like C/C++ which depends on the semantics of OpenCL vector
> types.
>
> NEON, yeah I think we should drop that one. The current GCC semantics
> are really Intel's vector semantics. By changing it from "GCC semantics"
> to "Default encoding and semantics used by target architecture's vector
> registers" I think we eliminate the need for that.
>
> You are correct boolean is for predicate register types. After looking
> at the calling conventions, these are not passed as types themselves. So
> for the purpose of this submission, I don't think we need it. I believe
> that some of the stuff that Tony and the AMD and Intel guys are almost
> ready to submit has DWARF examples of how to make use of predicate
> registers in SIMD and SIMT and access variables making use of predicate
> registers should be sufficient for those.
>
> ARM SVE and RISC-V RVV are really weird because of those HW
> implementation defined vs architecturally defined register and therefore
> type widths. It has been a couple of compiler generation iterations
> since I looked at the DWARF for those, but when I last looked, the
> compilers didn't know what to do with those and so they didn't generate
> usable DWARF. So I feel like there are additional unsolved problems with
> the SVE and RVV types that will need to be addressed. It is a problem,
> that I know that I need to look into -- but right now I do not have any
> "quality of DWARF" user issues pulling it closer to the top of my
> priority list. The only processor I've seen with SVE is the A64FX used
> in Fugaku and the HPE Apollo 80's, the Apple M1 and M2 don't have it and
> I haven't seen any of the newer ARM enterprise CPUs. I don't think there
> are any chips with RVV yet. Once more users have access to hardware that
> supports it, I know that it will be more of a problem. I kind of feel
> like that will be a whole submission in and of itself.
>
>
So you're thinking that "OpenCL vector semantics" ought to be 
determinable from DW_AT_language DW_LANG_OpenCL?  Seems reasonable.

DW_TENSOR_boolean: Could it just be determinable from the shape of the 
arra

Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-24 Thread Todd Allen via Dwarf-discuss
On 4/24/23 13:27, Ben Woodard via Dwarf-discuss wrote:
>
>> As for NEON vs. SVE, is there a need to differentiate them?  And can it
>> not be done by shape of the type?
>
> That one continues to be hard. ARM processors that support SVE also have
> NEON registers which like the Intel SSE MMX AVX kind of vector registers
> are architecturally specified as having a specific number of bits.
> Handling those is trivial.
>
> The weird thing about SVE registers (and the same thing also applies to
> RVV) is that the number of bits is not architecturally defined and is
> therefore unknown at compile time. The size of the registers can even
> vary from hardware implementation to hardware implementation. So a
> simple processor may only have a 128b wide SVE register while a monster
> performance core may have 2048b wide SVE registers. The predicate
> registers scale the same way. I think that it can even vary from core to core
> within a CPU sort of like intel's P-cores vs E-cores. To be able to even
> know how much a loop is vectorized you need to read a core specific
> register that specifies how wide the vector registers are on this
> particular core. Things like induction variables are incremented by the
> constant in that core specific register divided by size of the type
> being acted upon. So some of the techniques used to select lanes in
> DWARF don't quite work the same way.
>
> Just to make things even more difficult, when one of these registers is
> spilled to memory, like the stack, the size is unknown at compile time and
> so any subsequent spilling has to determine the size that it takes up.
> So any subsequent offsets need to use DWARF expressions that
> reference the width of the vector.
>
> ...and then there is SME which is like SVE but they are matrices rather
> than vectors. The mind boggles.
>
So the variability of the vector size is the only significant difference 
that you've identified?  If so, then I think the shape of the array type 
probably is sufficient.  For SVE, the DW_TAG_subrange_type will have a 
DW_AT_upper_bound which is a variable (reference or dwarf expr), or the 
DW_TAG_array_type's DW_AT_{byte,bit}_size will be a variable, or both.  
Meanwhile, NEON would use DW_AT_bit_size 128 (or DW_AT_byte_size 16) and 
a constant DW_AT_upper_bound (128/bitsizeof(elementtype)).  That seems 
like it very directly reflects the difference between the two vector types.
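
As a rough illustration of telling the two apart by shape alone: model each attribute value as either an integer constant or something non-constant (a DIE reference or DWARF expression, represented here as an opaque string).  The function names and the string placeholders are made up for this sketch, and how the element count maps onto DW_AT_upper_bound depends on the lower bound the producer uses.

```python
from typing import Union

# An attribute value is either a constant (int) or a non-constant
# reference/expression, modelled as an opaque string in this sketch.
AttrValue = Union[int, str]


def is_fixed_size_vector(byte_size: AttrValue, upper_bound: AttrValue) -> bool:
    """NEON-like shape: both the size and the bound are compile-time constants."""
    return isinstance(byte_size, int) and isinstance(upper_bound, int)


def neon_element_count(element_bits: int) -> int:
    """Number of lanes in a 128-bit vector: 128 / bitsizeof(elementtype)."""
    return 128 // element_bits


# A NEON vector of four 32-bit elements: DW_AT_byte_size 16, constant bound.
neon_lanes = neon_element_count(32)
print(neon_lanes, is_fixed_size_vector(16, neon_lanes))

# An SVE-like vector: size and bound are expressions evaluated at run time.
print(is_fixed_size_vector("expr: vector length in bytes",
                           "expr: vector length / element size"))
```

A consumer that needs to distinguish the two cases can therefore inspect the form of the size and bound attributes instead of consulting a separate enum value.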

>> If all those things You argued that it still should be an enum, but 
>> with only one "default"
>> value defined.  And I guess any other values that might be added later
>> would be (or at least start as) vendor extensions. It's peculiar, and I
>> don't think we have that anywhere else in the standard.
> I guess that my point is that I'm fairly certain that SVE and RVV will
> need special handling and when the compilers start handling the matrix
> types that the hardware is starting to support, they are going to need some
> help as well.
If there's something more peculiar about the types inhabiting these 
vector registers than "variable size", that might convince me.  But 
merely being variable-sized doesn't.
>> If it ever became necessary, you can always add a 2nd attribute for it.
>> As an example, in our Ada compiler decades ago, we did this for
>> DW_AT_artificial.  It's just a flag, so either present or not-present.
>> We added a 2nd DW_AT_artificial_kind with a whole bunch of different
>> enums for the various kinds our compiler generated.  The point is you
>> still can get there even if DW_AT_tensor is just a flag.
>
> Totally, not opposed to that if that is the way that people want to
> handle it. My only (admittedly weak) argument against doing it that way
> is that there will now be two attributes rather than one and the
> space that it takes up. John DelSignore was just dealing with a program
> that had 4.9GB of DWARF, it would be nice to keep it as compact as
> possible. Of course most of that is likely location lists and template
> instantiations and stuff like that not the relatively rare case like
> this. The cases where this shows up are likely going to be fairly rare.
>
> Would this be an acceptable compromise for V4 of my proposal? I drop it
> back to just being a flag for the time being. Then in a subsequent
> submission (which may or may not be in the DWARF6 cycle -- but hopefully
> is in time for DWARF6), if I find it necessary to make a flavor to
> support SVE, RVV or SME, then my submission for that will include
> changing DW_AT_tensor to require a constant that then references an
> enum like I did above. If it comes out before DWARF6 is released then
> great, we don't have to redefine anything. If it gets bumped to DWARF7 then
> we add a _kind attribute.

You can submit it in whichever form you prefer.  I supposed you were 
soliciting comments here to get it in a form as close to acceptable as 
possible before submitting it.  After you do, the committee will discuss 
it, probably ad nauseam.  (And I'll be n

Re: [Dwarf-discuss] EXTERNAL: Re: Enhancement: DWARF Extension Registry

2023-12-04 Thread Todd Allen via Dwarf-discuss
istorical counterargument: In the DWARF 2 spec, even 
though it was a radical departure from the DWARF 1 spec, some tag & attribute 
values from DWARF 1 were reserved in DWARF 2 just to avoid confusion (e.g. tags 
0x06, 0x07, 0x09, 0x0c, 0x0e).  So that was a choice made even for the 
spec-defined values.  This is a bit apples & oranges, but I think it's 
interesting that the thinking back then was the exact opposite: never reuse 
values.

Todd Allen
Concurrent Real-Time

-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss


Re: [Dwarf-discuss] EXTERNAL: Re: Proposal/clarification: "inherited" subrange bounds

2024-07-31 Thread Todd Allen via Dwarf-discuss
Alexandre,

Since you mention that this is for Ada, I'll mention this:

In our old Ada implementation, we made a decision not to use
DW_TAG_subrange_type as a general-purpose solution for Ada; we used it
only for array bounds, as described when it's introduced in section 5.5
Array Type Entries.  (Well, we were using the DWARF 2 spec, but you get
the idea.)

For Ada derived types & subtypes, we created something new:
DW_TAG_derived_type and DW_TAG_subtype. We had several reasons for doing
this.  But one benefit was that, for packed types, we used anonymous
DW_TAG_subtype DIE's instead of DW_TAG_subrange_type DIE's.  They had
DW_AT_subtype_parent attributes referencing back to the types on which
they were based, and we had the freedom to say that they inherited all
aspects from their "subtype parent", unless overridden.  Overridings
included size, encoding, and each of the bounds, individually.
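
To make the "inherit unless overridden" rule concrete, here is a minimal
Python sketch of how a consumer might resolve an attribute on such a DIE.
It is only an illustration under stated assumptions: the Die class, its
subtype_parent field (standing in for DW_AT_subtype_parent), and the
resolve helper are hypothetical names, not part of any real DWARF library
or of the compiler and debugger described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Die:
    """Hypothetical in-memory DIE: a tag plus a dict of its own attributes."""
    tag: str
    attrs: dict = field(default_factory=dict)
    subtype_parent: Optional["Die"] = None  # models DW_AT_subtype_parent


def resolve(die, name):
    """Return the nearest definition of `name`, walking up the parent chain.

    An attribute present on the DIE itself overrides the inherited one.
    """
    while die is not None:
        if name in die.attrs:
            return die.attrs[name]
        die = die.subtype_parent
    return None


# A base type, a constrained subtype, and an anonymous packed variant
# that overrides only the size.
base = Die("DW_TAG_base_type", {"DW_AT_byte_size": 4, "DW_AT_encoding": "signed"})
small = Die("DW_TAG_subtype", {"DW_AT_lower_bound": 0, "DW_AT_upper_bound": 7}, base)
packed = Die("DW_TAG_subtype", {"DW_AT_bit_size": 3}, small)
```

Here resolve(packed, "DW_AT_upper_bound") finds the bound on the parent
subtype, while resolve(packed, "DW_AT_bit_size") returns the overriding
value from the packed DIE itself.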

A DW_TAG_subtype could be reused for all packings of a particular type,
if the compiler was sure they all were identical.  But, of course, it's
possible to "quasi-pack" types using representation clauses, so it's
entirely possible to need more than 2 different representations.

FWIW, we didn't use DW_TAG_subtype for subtypes *initially*.  It was
only after attempting to use DW_TAG_subrange_type for a while, and
finding it unsuitable, that we created the new tag.  And it was cleaner
to have a new tag that had none of the DW_TAG_subrange_type array-specific
baggage.

Todd

On 7/30/24 19:21, Alexandre Oliva via Dwarf-discuss wrote:
>
> On Jul 29, 2024, David Blaikie  wrote:
>
>>> The situation is not very different, but in Ada one can specify the
>>> target size (in bits) for the type (which may require biased
> representations, but that's beside the point).  Despite the specified
>>> size, standalone variables and members of unpacked types use full
>>> storage units, unless packing is requested.  See
>>> e.g.
>>> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/testsuite/gnat.dg/bias1.adb
>> I see, so do I understand correctly that you'd prefer not to use the
>> bitfield style representation, because it'd be repetitious?
> There's that (Dwarf aims for compactness), but there's also the fact
> that the type size is explicitly specified as the smaller bit size, so a
> proper representation of that type would carry that piece of
> information.  ISTM that ideally the larger, full-unit-sized variant
> would be the one using explicit sizes or a separate type variant
> inheriting the same bounds.
>
> But the problem I see, and try to raise in this thread, is that there's
> no way for a subrange type to inherit bounds from another subrange type,
> which once again plays against compactness.
>
> --
> Alexandre Oliva, happy hacker            https://fsfla.org/blogs/lxo/
> Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice but
> very few check the facts.  Think Assange & Stallman.  The empires strike back


Re: [Dwarf-discuss] Proposal: Add support for "property" with getter/setter (based on Pascal properties)

2024-10-10 Thread Todd Allen via Dwarf-discuss
On 9/30/24 11:30, Adrian Prantl via Dwarf-discuss wrote:
> ## Proposed Changes
>
> ### Table 2.1
> add `DW_TAG_property`.
>
> ### 5.7.6 add the following normative text
>
>  `DW_TAG_property`
>
> Non-normative: Many object-oriented languages like Pascal and Objective-C 
> have properties, which are member functions that syntactically behave like 
> data members of an object. Pascal can also have global properties.
>
> A property is represented by a debugging information entry with the
> tag `DW_TAG_property`. A property entry has a `DW_AT_name` string
> attribute whose value is the property name. A property entry has a
> `DW_AT_type` attribute to denote the type of that property.
>
> A property may have `DW_AT_accessibility`, `DW_AT_external`, 
> `DW_AT_virtuality`, `DW_AT_start_scope`, `DW_AT_decl_column`, 
> `DW_AT_decl_file` and `DW_AT_decl_line` attributes with the respective 
> semantics described for these attributes for `DW_TAG_member` (see chapter 
> 5.7.6).
>
> A property may have one or several of `DW_TAG_property_getter`, 
> `DW_TAG_property_setter`, or `DW_TAG_property_stored` children to represent 
> the getter and setter (member) functions, or the Pascal-style `stored` 
> accessor for this property. Each of these tags has a `DW_AT_specification` 
> attribute to point to a (member) function declaration. They may also have 
> `DW_TAG_formal_parameter` children that can have `DW_AT_default_value` 
> attributes to declare additional default arguments for when these functions 
> are used as property accessors.
> Some languages can automatically derive accessors for properties from a field 
> in the object. In such cases the `DW_AT_specification` attribute of the 
> accessor entry may point to the `DW_TAG_member` entry of the field that holds 
> the property's underlying storage.

Adrian,

This usage of DW_AT_specification seems very different from other uses 
of DW_AT_specification, where it's indicating that the current DIE is a 
completion of a forward declaration at the referenced DIE.  This usage 
is a bit closer to a "renames", but it isn't even quite that.  I suggest 
not attempting to use DW_AT_specification for this case, and just 
inventing a new attribute.

Todd


Re: [Dwarf-discuss] Proposal: Add support for "property" with getter/setter (based on Pascal properties)

2024-10-14 Thread Todd Allen via Dwarf-discuss
You always can just make a similar statement for the new attribute: It 
inherits attributes from the referenced DIE unless overridden in this 
DIE.  Or somesuch.

On 10/12/24 17:10, Adrian Prantl wrote:
>
>> On Oct 10, 2024, at 9:31 AM, Todd Allen via Dwarf-discuss 
>>  wrote:
>>
>> Adrian,
>>
>> This usage of DW_AT_specification seems very different from other uses
>> of DW_AT_specification, where it's indicating that the current DIE is a
>> completion of a forward declaration at the referenced DIE.  This usage
>> is a bit closer to a "renames", but it isn't even quite that.  I suggest
>> not attempting to use DW_AT_specification for this case, and just
>> inventing a new attribute.
>>
>> Todd
>  From the spec:
>
>> A debugging information entry that represents a declaration that completes
>> another (earlier) non-defining declaration may have a DW_AT_specification
>> attribute whose value is a reference to the debugging information entry
>> representing the non-defining declaration.
> You are right that this doesn't fit because in the property use-case we don't 
> complete a forward declaration, we just refer to it.
>
>> A debugging information entry with a
>> DW_AT_specification attribute does not need to duplicate information provided
>> by the debugging information entry referenced by that specification 
>> attribute.
> This behavior is something I want though.
>
> I'll replace it with a new attribute for now.
>
> thanks,
> Adrian




Re: [Dwarf-discuss] Proposal: Add support for "property" with getter/setter (based on Pascal properties)

2024-10-14 Thread Todd Allen via Dwarf-discuss
Perhaps you could use an attribute with the full "path" as a string.
I'm not sure it's high-value information, perhaps only useful for some
sort of "info type" command that wants to describe the type.

On 10/12/24 16:59, Adrian Prantl via Dwarf-discuss wrote:
>
> On Sep 30, 2024, at 1:39 PM, Martin  wrote:
>> On 30/09/2024 19:30, Adrian Prantl wrote:
>>> PS: One thing I left out is DW_AT_Property_Object. It wasn't clear to me 
>>> why this wouldn't always be the address of the parent object of the 
>>> DW_TAG_property.
>> There is a construct where a property can point to an embedded structure.
>>
>> type
>> TMyRecord = record
>>     a,b: integer;
>> end;
>>
>> TMyClass = class
>>     FPadding: word;
>>     FData: TMyRecord;
>>     FOther: TMyRecord; // Can't use the type to search for FData
>>     property ValA: integer read FData.a;
>> end;
>>
>> In that case DW_AT_Property_Forward  points to the member "a" in 
>> "TMyRecord". This would be a DW_TAG_Member, which would have a 
>> DW_AT_data_member_location relative to the structure FData address. However 
>> there is no address where MyRecord is stored.
> Interesting example! In this case a DW_TAG_property_getter needs to point to 
> the member "a" of the field FData specifically, so it can't be a reference to 
> the DW_TAG_member "a" in TMyRecord. I can't think of a good way of preserving 
> the access path of the field here. We could allow a DW_AT_location + 
> DW_AT_type to allow a consumer to derive the value, but not the access path. 
> In the general case (think something like "read 
> FBinaryTree.left.left.right.data") we lack the expressivity to preserve which 
> sub-field in the data structure, since DWARF does not encode expressions, 
> only types.
>
> I think it would be reasonable to allow the common case of referring to a 
> top-level field via a DW_TAG_member ref, and having an arbitrary DWARF 
> expression in a DW_AT_location to recover the location in all other cases.
>
> -- adrian


Re: [Dwarf-discuss] Proposal: Add support for "property" with getter/setter (based on Pascal properties)

2024-10-14 Thread Todd Allen via Dwarf-discuss
Playing devil's advocate here: Can you mix these two types of 
properties?  That is, something like the following (please excuse any 
wrong syntax):

type
TMyRecord = record
    a,b: integer;
end;

TMyClass = class
    function GetProp: TMyRecord;
    procedure SetProp(AVal: TMyRecord);
    property MyProp: TMyRecord read GetProp write SetProp;
    property MyPropA: integer read MyProp.a;
end;

That is, a read of TMyClass's MyPropA would call TMyClass.GetProp and 
then return only its member a?

On 10/13/24 04:15, Martin via Dwarf-discuss wrote:
>
> On 13/10/2024 00:59, Adrian Prantl via Dwarf-discuss wrote:
>> On Sep 30, 2024, at 1:39 PM, Martin  wrote:
>>>
>>> On 30/09/2024 19:30, Adrian Prantl wrote:
>>>> PS: One thing I left out is DW_AT_Property_Object. It wasn't clear 
>>>> to me why this wouldn't always be the address of the parent object 
>>>> of the DW_TAG_property.
>>>
>>> There is a construct where a property can point to an embedded 
>>> structure.
>>>
>>> type
>>> TMyRecord = record
>>>    a,b: integer;
>>> end;
>>>
>>> TMyClass = class
>>>    FPadding: word;
>>>    FData: TMyRecord;
>>>    FOther: TMyRecord; // Can't use the type to search for FData
>>>    property ValA: integer read FData.a;
>>> end;
>>>
>>> In that case DW_AT_Property_Forward  points to the member "a" in 
>>> "TMyRecord". This would be a DW_TAG_Member, which would have a 
>>> DW_AT_data_member_location relative to the structure FData address. 
>>> However there is no address where MyRecord is stored.
>>
>> Interesting example! In this case a DW_TAG_property_getter needs to 
>> point to the member "a" of the field FData specifically, so it can't 
>> be a reference to the DW_TAG_member "a" in TMyRecord. I can't think 
>> of a good way of preserving the access path of the field here. We 
>> could allow a DW_AT_location + DW_AT_type to allow a consumer to 
>> derive the value, but not the access path. In the general case (think 
>> something like "read FBinaryTree.left.left.right.data") we lack the 
>> expressivity to preserve which sub-field in the data structure, since 
>> DWARF does not encode expressions, only types.
>>
>> I think it would be reasonable to allow the common case of referring 
>> to a top-level field via a DW_TAG_member ref, and having an arbitrary 
>> DWARF expression in a DW_AT_location to recover the location in all 
>> other cases.
>>
> Well, I don't think we need the "access path", as in full access path.
> As you pointed out "a" could be in a nested record of TMyRecord, but I
> can't think of any case where that access path would matter.
>
> If we add DW_AT_Location:
> - keep the normal fields in the property, including the reference to "a"
> - add DW_AT_Location to overwrite the DW_AT_data_member_location
> then, yes we have all the info we need.
>
> There is a choice to make that DW_AT_Location point directly to the data
> of the field. Or to the location of the directly enclosing object.
>
> My original idea was to describe the location of "FData" with a location
> expression, and then the debugger still adds the
> DW_AT_data_member_location to that.
>
> Currently directly specifying the location of the "pointed to field" is
> the simplest way.
>
> And currently, in Pascal, the above only works for fields in a record.
> So the address of the record containing the field should never be
> needed. But if that ever is extended (or if any other language allows)
> to use a getter method from such an object, then the object address
> would be needed for the "this" parameter.
> So maybe it is safer to use the DW_AT_Location to specify the address of
> that containing object?
>
> In either case, the nested object can be different for getter/setter. So
> that additional address must be given per getter/setter.


Re: [Dwarf-discuss] bit-string

2025-01-02 Thread Todd Allen via Dwarf-discuss
I'm a bit late to the party, but here goes...

Our Ada compiler has support for this, described in terms of arrays.
It's similar to the Pascal example that someone provided.  If the only
way to get a type like this in PL/I is with a bit string, then the
producer could still describe this as an array, and a debugger could
"just know" about the syntactic sugar, and describe the type as a bit
string.

The tricky piece in your case, and this differed from our Ada
implementation, is that the string length (a.k.a. array upper bound,
possibly +1) is at the beginning of the object.  So you'd need to use a
DW_AT_data_location to inform a debugger to skip past that to get to the
actual bits.
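
A rough Python sketch of what a consumer would do with such a layout: read
the length prefix, skip past it (the job DW_AT_data_location would do), and
keep only the bits in use. The function name, the big-endian length field,
and the MSB-first bit order are assumptions made for illustration; the
2-byte and 4-byte prefix sizes come from the PL/I example quoted below.

```python
def decode_varying_bits(raw: bytes, prefix_len: int = 2,
                        byteorder: str = "big") -> str:
    """Decode a length-prefixed bit string into a string of '0'/'1'.

    raw        -- the whole object: length prefix followed by the bit data
    prefix_len -- size of the length field in bytes (assumed 2 or 4)
    byteorder  -- byte order of the length field (an assumption)
    """
    nbits = int.from_bytes(raw[:prefix_len], byteorder)
    data = raw[prefix_len:]  # where DW_AT_data_location would point
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits[:nbits]      # only the left-most nbits are in use


# A 2-byte prefix holding 2, then one byte whose left-most 2 bits are '10'.
short = decode_varying_bits(bytes([0x00, 0x02, 0b10000000]))

# A 4-byte prefix holding 12, then two data bytes.
longer = decode_varying_bits(bytes([0, 0, 0, 12, 0b10100000, 0b00100000]),
                             prefix_len=4)
```

With these inputs, short is "10" and longer is "101000000010".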

On 12/18/24 13:39, Thomas David Rivers via Dwarf-discuss wrote:
> David Blaikie wrote:
>> What sort of string-like behavior would one want on a bit-string? Seems
>> like array would be the better fit to me...
>>
>   It's a language that originates in the late 60s - so there are some,
>   "different" ideas than more contemporary languages.
>
>   You can ask for sub-strings of strings (character or bit), you can
>   assign a string (shorter or of equal length) to a target string,
>   you can also concatenate strings, and you can assign to portions
>   within a target bit string (or character string for that matter.)
>
>   e.g.
>
>   test: proc;
>
>     dcl b10 bit(10);
>     dcl b20 bit(20) varying;
>     dcl dest bit(30) varying4;  /* maximum possible result size */
>
>     b10 = '1010'b;  /* assigns '101000' to b10 (padding on the right */
>                     /* with 0 bits because the length is fixed). */
>
>     b20 = '10'b;    /* assigns '10' to b20, moving a byte containing  */
>                     /* '10..' and setting the 2-byte length field */
>                     /* to 2, indicating the left-most 2 bits are used. */
>
>     dest = b10 || b20;  /* assigns '10100010' to dest, setting its */
>                         /* 4-byte length field to 12. */
>
>     b10 = substr(dest,1,2);  /* extracts the first 2 bits from the bitfield */
>                              /* dest ('10') (2 bits starting at position #1) and assigns  */
>                              /* it to b10, which is extended to 10 bits on-the-right  */
>                              /* because b10 is a fixed-length, resulting in '10'  */
>                              /* being assigned to b10 */
>
>     substr(b10,6,1) = '1'b;  /* assigns a 1-bit value to b10, starting */
>                              /* at bit #6 and going for 1 bit, resulting */
>                              /* in '101001' being in b10 */
>
>   end proc;
>
>   So - it's a rather abstract idea of "strings of bits" similar to
>   "strings of characters".
>
>   - Dave Rivers -
>
> --
> riv...@dignus.com
>


Re: [Dwarf-discuss] Proposal: Add support for "property" with getter/setter (based on Pascal properties)

2025-03-17 Thread Todd Allen via Dwarf-discuss
Adrian,

I noticed a wording duplication in my read-through this morning in 5.19(?), 
paragraph 5:
    Some languages can automatically derive property derive accessors from ...
I think that 2nd "derive" needs to go.

Todd



Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-02 Thread Todd Allen via Dwarf-discuss
FWIW, when we at Concurrent were in the compiler business, our C++ compilers 
generated two vendor-defined attributes, both hanging off the 
DW_TAG_{structure,class}_type.  Here are a couple with some sample locations:
DW_AT_vtable_location [DW_OP_plus_uconst 0; DW_OP_deref]
DW_AT_type_vtable_location [DW_OP_addr 0x12345678]
The first was a description of how to obtain the address of the vtable tag from 
an object.
The second was a description of the address of the vtable tag from just the 
type.

As we characterized them internally, they didn't have to be the address of the 
vtable proper.  They just had to be something that could be compared as a 
positive identification of the actual type.  I believe they always were the 
actual vtable addresses, though.  Because why not?

We do still have logic in our debugger to use them, too.  In addition to the 
mangling-based approaches.

It does require walking the whole DWARF tree to find them.
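
For illustration, the comparison those two attributes enable might look
like the following Python sketch. Everything here is hypothetical
scaffolding: memory is modeled as a dict from address to pointer value,
type_vtables stands for a table collected by walking all the type DIEs
once, and the location expression is hard-wired to the sample
[DW_OP_plus_uconst 0; DW_OP_deref] shown above.

```python
def eval_vtable_location(obj_addr, memory, offset=0):
    """Model [DW_OP_plus_uconst offset; DW_OP_deref] on an object address.

    `memory` is a hypothetical dict mapping addresses to the pointer-sized
    values stored there.
    """
    return memory[obj_addr + offset]


def most_derived_type(obj_addr, memory, type_vtables):
    """Return the type whose per-type vtable tag matches the object's tag.

    `type_vtables` maps a type name to the address its (hypothetical)
    DW_AT_type_vtable_location evaluates to.
    """
    tag = eval_vtable_location(obj_addr, memory)
    for type_name, vtable_addr in type_vtables.items():
        if vtable_addr == tag:
            return type_name
    return None  # no positive identification


# One object at 0x1000 whose first word points at CDerived's vtable tag.
memory = {0x1000: 0x12345678}
type_vtables = {"CBase": 0x12340000, "CDerived": 0x12345678}
found = most_derived_type(0x1000, memory, type_vtables)
```

Here found is "CDerived". The linear scan mirrors the cost noted above; a
consumer could invert type_vtables into an address-keyed index after the
first walk.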

Todd

On 4/25/25 09:49, Jeremy Morse via Dwarf-discuss wrote:
Hi all,

The LLVM discussion linked [0] happens to be us Sony folks, and it's supporting 
the use-case Kyle described of automatic downcasting, i.e. identifying the 
most-derived-class of an object from its vtable pointer. Having to demangle the 
symbol table is a real pain (Tom, CC'd knows more) especially with things like 
anonymous namespaces.

Right now the approach is to have a top-level nameless global variable with the 
location set to the vtable address, and a DW_AT_specification linking into the 
class definition:

0x0082:   DW_TAG_variable
            DW_AT_specification (0x00b6 "_vtable$")
            DW_AT_alignment     (8)
            DW_AT_location      (DW_OP_addrx 0x1)

[Then deeper into the DIE tree,]

0x008b:   DW_TAG_structure_type
            DW_AT_containing_type    (0x0034 "CBase")
            DW_AT_calling_convention (DW_CC_pass_by_reference)
            DW_AT_name               ("CDerived")
            DW_AT_decl_file          ("vtables.cpp")
            DW_AT_decl_line          (6)

[...]

0x00b6:     DW_TAG_variable
              DW_AT_name          ("_vtable$")
              DW_AT_type          (0x0081 "void *")
              DW_AT_external      (true)
              DW_AT_declaration   (true)
              DW_AT_artificial    (true)
              DW_AT_accessibility (DW_ACCESS_private)

This works well enough for our own debugger use-cases; I agree with Cary that 
it's hacky to rely on the name of a variable to signify important information 
like this and an officially blessed way could help.

I've no opinion on the  DW_AT_vtable_elem_location behaviours, although we can 
consider it a separate issue.

[0] https://github.com/llvm/llvm-project/pull/130255

--
Thanks,
Jeremy






Re: [Dwarf-discuss] EXTERNAL: Enhancement: Dynamic DW_AT_data_bit_offset

2025-04-21 Thread Todd Allen via Dwarf-discuss
Tom,

When I was adding DWARF support to my company's Ada compiler many years ago, 
using reference, exprloc, loclist, etc. forms on attributes that normally 
didn't allow them was a very common tactic.  Ada is not a language that DWARF 
claims to support fully (not even close), so I always assumed this fell under 
the vendor extensibility rules.  In the DWARF 5 document, 1.3.13, it says:
For language features that are not supported, implementors may use existing 
attributes in novel ways...
I read that as the permission you need.

Todd

On 4/17/25 13:58, Tom Tromey via Dwarf-discuss wrote:

Consider the appended Ada program.  Here, the offset of "Another_Field"
is a non-constant number of bits from the start of the object.

I think there is no way to represent this in DWARF 5.  Section 5.7.6,
page 119 says:

For a DW_AT_data_bit_offset attribute, the value is an integer
constant (see Section 2.19 on page 55) that specifies the number of
bits from the beginning of the containing entity to the beginning of
the data member. This value must be greater than or equal to zero,
but is not limited to less than the number of bits per byte.

GNAT works around this using the deprecated-in-DWARF-4 DW_AT_bit_offset
in conjunction with DW_AT_data_member_location.  (You need a patch to
GNAT to see this in action.)

One way to fix this would be to lift the "integer constant" restriction
and allow an expression here.
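
As a sketch of what an expression-valued DW_AT_data_bit_offset would let a
consumer compute for the program below, the following Python models a bit
offset that depends on the discriminant. The layout is an assumption chosen
for illustration (an 8-bit Discr, then Discr 2-bit array elements, then
Field, then Another_Field, with bits numbered from the most significant
end); it is not a claim about what GNAT actually emits. The bias of -7
follows from the declared range of Small.

```python
ELEM_BITS = 2      # Small'Size
SMALL_FIRST = -7   # lower bound of Small, used as the bias


def another_field_bit_offset(discr, discr_bits=8):
    """Bit offset of Another_Field under the assumed (hypothetical) layout."""
    # Discr, then `discr` array elements, then Field, then Another_Field.
    return discr_bits + discr * ELEM_BITS + ELEM_BITS


def read_small(raw, bit_offset):
    """Extract a biased 2-bit Small at bit_offset, MSB-first bit numbering."""
    total_bits = len(raw) * 8
    shift = total_bits - bit_offset - ELEM_BITS
    stored = (int.from_bytes(raw, "big") >> shift) & ((1 << ELEM_BITS) - 1)
    return stored + SMALL_FIRST


# Discr = 3, Array_Field = (-5, -6, -7), Field = -4, Another_Field = -6,
# encoded by hand under the assumed layout.
raw = bytes([0x03, 0b10010011, 0b01000000])
offset = another_field_bit_offset(raw[0])   # 16
value = read_small(raw, offset)             # -6
```

The point is only that the offset is a function of Discr read from the
object, which a constant-class attribute value cannot express.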

thanks,
Tom


procedure Exam is
   type Small is range -7 .. -4;
   for Small'Size use 2;

   type Packed_Array is array (Integer range <>) of Small;
   pragma pack (Packed_Array);

   subtype Range_Int is Natural range 0 .. 7;

   type Some_Packed_Record (Discr : Range_Int := 3) is record
      Array_Field : Packed_Array (1 .. Discr);
      Field       : Small;
      case Discr is
         when 3 =>
            Another_Field : Small;
         when others =>
            null;
      end case;
   end record;
   pragma Pack (Some_Packed_Record);
   pragma No_Component_Reordering (Some_Packed_Record);

   SPR : Some_Packed_Record := (Discr         => 3,
                                Field         => -4,
                                Another_Field => -6,
                                Array_Field   => (-5, -6, -7));

begin
   null;
end Exam;


Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Todd Allen via Dwarf-discuss
I think that's orthogonal to the point, which was that a rnglist is 
meant to describe a pc range, not a data range.

BTW, I don't think you need to create the vtable on the fly in that 
case.  The Pkg.Print function will need a static link, but the vtable 
doesn't need to encode that.  I just checked our Ada compiler.  We 
stopped development on this compiler after Ada 95, but the only change I 
had to make to your example to make it Ada 95-friendly, was the call to 
Print: Pkg.Print(Object).  (They were avoiding the Object.Function 
syntax in Ada 95, but I assume they relented and added some syntactic 
sugar in Ada 2005 or 2012.)  Our compiler does indeed generate a static 
vtable.

But if GNAT is generating the vtable at run-time, possibly even on the 
stack (maybe there's some other, more compelling, reason?), then we need 
to make sure the proposal isn't assuming a static location.

On 5/7/25 07:07, Pierre-Marie de Rodat wrote:
> Hello,
>
> On Wed, May 7, 2025 at 2:49 PM Todd Allen via Dwarf-discuss
>  wrote:
>> In 250506.2, the use of a rnglist is throwing me.  I would expect the 
>> lifetime of a vtable to be the whole program.  Or did you envision the 
>> rnglist to be the range of data/rodata addresses of the vtable object?  2.17 
>> clarifies that they're code addresses (i.e. text), though.
>>
>> We did have a discussion sometime in the last year about describing 
>> data/rodata address ranges, but that was in .debug_aranges (RIP).  And, 
>> IIRC, no actual compiler was generating data/rodata address there either.
> If it helps the design: there are languages where vtables are not
> necessarily statically allocated. Here is a small Ada example,
> involving a tagged type (equivalent to a C++ class) nested in a
> procedure, and with a primitive (C++ method) that actually has
> up-level references to the procedure locals (so the vtable is actually
> tied to the current stack frame):
>
>    1  with Ada.Text_IO; use Ada.Text_IO;
>    2
>    3  procedure Main is
>    4     Msg : constant String := "Hello world";
>    5
>    6     package Pkg is
>    7        type T is tagged null record;
>    8        procedure Print (Self : T);
>    9     end Pkg;
>   10
>   11     package body Pkg is
>   12        procedure Print (Self : T) is
>   13        begin
>   14           Put_Line (Msg);
>   15        end Print;
>   16     end Pkg;
>   17
>   18     Object : Pkg.T;
>   19  begin
>   20     Object.Print;
>   21  end Main;
>
> GDB allows us to observe where the vtable for T is stored (tested on a
> x86_64-linux machine):
>
> $ gdb ./main
> (gdb) b main.adb:20
> Breakpoint 1 at 0x6ae9: file main.adb, line 20.
> (gdb) r
> […]
> Breakpoint 1, main () at main.adb:20
> 20 Object.Print;
> (gdb) set lang c
> Warning: the current language does not match this frame.
> (gdb) print object
> $1 = {_tag = 0x7fffda30}
> (gdb) p $rsp
> $2 = (void *) 0x7fffd7d0
> (gdb) p $rbp
> $3 = (void *) 0x7fffda70
>
> “_tag” is an artificial component for the record T that GCC
> (currently) generates in the debug info to materialize the vtable: it
> points to a structure that is in the current stack frame (between $rsp
> and $rbp).
>
> --
> Pierre-Marie de Rodat 




Re: [Dwarf-discuss] EXTERNAL: Re: Representing vtables in DWARF for downcasting

2025-05-07 Thread Todd Allen via Dwarf-discuss
>> BTW, I don't think you need to create the vtable on the fly in that
>> case.  The Pkg.Print function will need a static link, but the vtable
>> doesn't need to encode that.  I just checked our Ada compiler.  We
>> stopped development on this compiler after Ada 95, but the only change I
>> had to make to your example to make it Ada 95-friendly, was the call to
>> Print: Pkg.Print(Object).  (They were avoiding the Object.Function
>> syntax in Ada 95, but I assume they relented and added some syntactic
>> sugar in Ada 2005 or 2012.)  Our compiler does indeed generate a static
>> vtable.
>
> Interesting: this indeed works if Main.Pkg.Print calls take a static
> link, but how can the caller find the static link to pass in the
> general case? This is obvious in my previous example (the call happens
> in the same scope that owns the static link), but thanks to type
> derivation, calls to Main.Pkg.Print can actually appear in other
> places. For instance:
>
> package Base is
>    type T is abstract tagged null record;
>    procedure Print (Self : T) is abstract;
>    procedure Call_Print (Self : T'Class);
>    function Get_Msg return String;
> end Base;
>
> package body Base is
>    procedure Call_Print (Self : T'Class) is
>    begin
>       Print (Self);
>    end Call_Print;
>
>    function Get_Msg return String is
>    begin
>       return "Hello world";
>    end Get_Msg;
> end Base;
>
> with Ada.Text_IO; use Ada.Text_IO;
> with Base;
>
> procedure Main is
>    Msg : constant String := Base.Get_Msg;
>
>    package Pkg is
>       type T is new Base.T with null record;
>       overriding procedure Print (Self : T);
>    end Pkg;
>
>    package body Pkg is
>       overriding procedure Print (Self : T) is
>       begin
>          Put_Line (Msg);
>       end Print;
>    end Pkg;
>
>    Object : Pkg.T;
> begin
>    Base.Call_Print (Object);
> end Main;
>
> Since Main.Pkg.Print overrides a library-level primitive, it can’t
> take a static link so the only way for it to have access to the
> Main.Msg local is through Self. I guess a compiler could decide to put
> the static link in each Main.Pkg.T object rather than in their vtable
> (and thus have a static vtable), but as far as I can tell, GNAT stores
> the static link in the vtable instead, so the vtable cannot be static.



That looks like the more compelling reason, then.

I tried this in our Ada 95 compiler, and it's rejected because of accessibility 
rules (3.9.1(3)).  And it looks like a legit rejection.  Those usually are 
designed to avoid dangling references of access types, but they used the same 
concept for type extensions (derived tagged types).  I don't know if the 
rationale was to avoid static link issues specifically, or if that was just a 
happy side effect.  But evidently they loosened the rules in a later language 
revision.

BTW, I sure hope objects of Main.Pkg.T cannot escape the invocation of Main!  
If so, it seems like you're moving into lambda closure territory.

Todd



Re: [Dwarf-discuss] Representing vtables in DWARF for downcasting

2025-05-07 Thread Todd Allen via Dwarf-discuss
In 250506.2, the use of a rnglist is throwing me.  I would expect the lifetime 
of a vtable to be the whole program.  Or did you envision the rnglist to be the 
range of data/rodata addresses of the vtable object?  2.17 clarifies that 
they're code addresses (i.e. text), though.

We did have a discussion sometime in the last year about describing data/rodata 
address ranges, but that was in .debug_aranges (RIP).  And, IIRC, no actual 
compiler was generating data/rodata addresses there either.

On 5/6/25 19:19, Cary Coutant wrote:
I've written a three-part proposal to address these issues:

  *   The first part, 250506.1 <https://dwarfstd.org/issues/250506.1.html>,
      proposes a standard mechanism for locating the virtual function table
      (vtable) given an object of a polymorphic class.
  *   The second part, 250506.2 <https://dwarfstd.org/issues/250506.2.html>,
      proposes a standard mechanism for identifying the most-derived class of
      an object, given its vtable location, in order to support downcasting of
      pointers while debugging.
  *   The third part, 250506.3 <https://dwarfstd.org/issues/250506.3.html>,
      proposes a fix to the DW_AT_vtable_elem_location attribute, which appears
      to be incorrectly implemented in compilers today.

-cary


On Fri, May 2, 2025 at 1:31 PM Todd Allen via Dwarf-discuss 
<dwarf-discuss@lists.dwarfstd.org> wrote:
FWIW, when we at Concurrent were in the compiler business, our C++ compilers 
generated two vendor-defined attributes, both hanging off the 
DW_TAG_{structure,class}_type.  Here are a couple with some sample locations:
DW_AT_vtable_location [DW_OP_plus_uconst 0; DW_OP_deref]
DW_AT_type_vtable_location [DW_OP_addr 0x12345678]
The first was a description of how to obtain the address of the vtable tag from 
an object.
The second was a description of the address of the vtable tag from just the 
type.

As we characterized them internally, they didn't have to be the address of the 
vtable proper.  They just had to be something that could be compared as a 
positive identification of the actual type.  I believe they always were the 
actual vtable addresses, though.  Because why not?

We do still have logic in our debugger to use them, too.  In addition to the 
mangling-based approaches.

It does require walking the whole DWARF tree to find them.
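
To make the comparison concrete, here is a minimal Python sketch of how a debugger might use the two attributes above. It is only an illustration under stated assumptions: target memory is modeled as a plain dict, the type list stands in for the result of walking the DWARF tree, and every name and address in it (eval_vtable_location, build_type_index, dynamic_type, the sample addresses) is hypothetical rather than taken from any real debugger.

```python
# Hypothetical sketch: find an object's actual type by comparing vtable addresses.
# "memory" models target memory as {address: pointer-sized value}.

def eval_vtable_location(memory, object_addr, offset=0):
    """Model [DW_OP_plus_uconst <offset>; DW_OP_deref] applied to the object address."""
    return memory[object_addr + offset]

def build_type_index(types):
    """Map each type's DW_AT_type_vtable_location address to the type name.

    "types" stands in for one full walk over the DWARF tree.
    """
    return {vtable_addr: name for name, vtable_addr in types}

def dynamic_type(memory, object_addr, type_index):
    """Return the name of the matching type, or None if no type claims the address."""
    return type_index.get(eval_vtable_location(memory, object_addr))

# Made-up sample data: two types and two objects whose first word is the vtable address.
types = [("CBase", 0x12345678), ("CDerived", 0x12345700)]
memory = {0x7000: 0x12345700, 0x7100: 0x12345678}
index = build_type_index(types)
```

Building the index once amortizes the cost of the full tree walk; each later lookup is a single memory read plus a dictionary probe.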

Todd

On 4/25/25 09:49, Jeremy Morse via Dwarf-discuss wrote:
Hi all,

The LLVM discussion linked [0] happens to be us Sony folks, and it's supporting 
the use-case Kyle described of automatic downcasting, i.e. identifying the 
most-derived-class of an object from its vtable pointer. Having to demangle the 
symbol table is a real pain (Tom, CC'd, knows more), especially with things like 
anonymous namespaces.

Right now the approach is to have a top-level nameless global variable with the 
location set to the vtable address, and a DW_AT_specification linking into the 
class definition:

0x0082:   DW_TAG_variable
            DW_AT_specification (0x00b6 "_vtable$")
            DW_AT_alignment     (8)
            DW_AT_location      (DW_OP_addrx 0x1)

[Then deeper into the DIE tree,]

0x008b:   DW_TAG_structure_type
            DW_AT_containing_type    (0x0034 "CBase")
            DW_AT_calling_convention (DW_CC_pass_by_reference)
            DW_AT_name               ("CDerived")
            DW_AT_decl_file          ("vtables.cpp")
            DW_AT_decl_line          (6)

[...]

0x00b6:     DW_TAG_variable
              DW_AT_name          ("_vtable$")
              DW_AT_type          (0x0081 "void *")
              DW_AT_external      (true)
              DW_AT_declaration   (true)
              DW_AT_artificial    (true)
              DW_AT_accessibility (DW_ACCESS_private)

This works well enough for our own debugger use-cases; I agree with Cary that 
it's hacky to rely on the name of a variable to signify important information 
like this and an officially blessed way could help.
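
For illustration, the lookup a debugger would perform over the DIEs shown above could be sketched in Python as follows. This is a toy model, not a real DWARF reader: the DIES dict, the class_for_vtable helper and the resolved address 0x4010 are all hypothetical, and the sketch assumes that the "_vtable$" declaration at 0x00b6 is a child of the CDerived structure type at 0x008b.

```python
# Hypothetical in-memory model of the three DIEs; not a real DWARF reader API.
# "location" holds the already-resolved vtable address (0x4010 is made up).
DIES = {
    0x0082: {"tag": "DW_TAG_variable", "parent": None,
             "specification": 0x00b6, "location": 0x4010},
    0x008b: {"tag": "DW_TAG_structure_type", "parent": None, "name": "CDerived"},
    0x00b6: {"tag": "DW_TAG_variable", "parent": 0x008b, "name": "_vtable$"},
}

def class_for_vtable(dies, vtable_addr):
    """Return the name of the class whose vtable lives at vtable_addr, or None."""
    for die in dies.values():
        # Only variable definitions located at the requested address are candidates.
        if die["tag"] != "DW_TAG_variable" or die.get("location") != vtable_addr:
            continue
        # Follow DW_AT_specification to the in-class declaration.
        decl = dies.get(die.get("specification"))
        if decl is None or decl.get("name") != "_vtable$":
            continue
        # The declaration's parent DIE is the class that owns the vtable.
        owner = dies.get(decl["parent"])
        if owner is not None:
            return owner.get("name")
    return None
```

The check on the "_vtable$" name is exactly the name-based convention that an officially blessed attribute would replace.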

I've no opinion on the DW_AT_vtable_elem_location behaviours, although we can 
consider it a separate issue.

[0] https://github.com/llvm/llvm-project/pull/130255

--
Thanks,
Jeremy




