Hi Martin,

AMD faced the same problem you are describing when generating DWARF for optimized code running on AMD GPUs. After considering several alternatives, they came up with a solution that is described in complete detail here:

https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html

One of the key concepts in the extensions, which I think addresses the problem you described, is to make locations (memory, register, composite, implicit, and undefined) first-class objects that can be pushed onto the DWARF evaluation stack. In other words, instead of making a location the result of an evaluation, locations can be pushed, popped, and operated on just like addresses and values in the current model. The extensions are backwards compatible with DWARF 5 and add only a handful of operators. I find the design and formal specification of the model quite elegant. Further, I think the extensions are generally useful, not just specific to GPUs.
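
To make the idea concrete, here is a minimal Python sketch of an evaluation stack whose entries can be either plain values or location objects. It is a toy model only, not the semantics defined in the extensions document and not how any debugger implements it; the class names, the helper functions, and the register and memory contents are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class MemoryLocation:
    address: int  # hypothetical: a location in target memory


@dataclass(frozen=True)
class RegisterLocation:
    regnum: int  # hypothetical: a location naming a target register


# A stack entry is either a plain value or a location object.
StackEntry = Union[int, MemoryLocation, RegisterLocation]


class ToyMachine:
    """Made-up target state: a register file and word-addressed memory."""

    def __init__(self, registers, memory):
        self.registers = registers  # regnum -> value
        self.memory = memory        # address -> value


def op_reg(stack, regnum):
    # Push the register itself as a location, not its contents.
    stack.append(RegisterLocation(regnum))


def op_deref(stack, machine):
    # Pop a location (or a plain address) and push the value stored there.
    top = stack.pop()
    if isinstance(top, RegisterLocation):
        stack.append(machine.registers[top.regnum])
    elif isinstance(top, MemoryLocation):
        stack.append(machine.memory[top.address])
    else:
        # A plain integer is treated as a memory address, as in DWARF 5.
        stack.append(machine.memory[top])


machine = ToyMachine(registers={3: 0x1234},
                     memory={0x2000: 0x1234, 0x1234: 42})

# Case 1: the hidden pointer lives in register 3.
stack = []
op_reg(stack, 3)
op_deref(stack, machine)  # reads the register: pushes 0x1234
op_deref(stack, machine)  # follows the pointer into memory
result_from_register = stack.pop()

# Case 2: the hidden pointer lives in memory at 0x2000.
stack = [MemoryLocation(0x2000)]
op_deref(stack, machine)
op_deref(stack, machine)
result_from_memory = stack.pop()
```

In this sketch both cases end at the same data, because the dereference operation dispatches on the kind of location it pops rather than assuming a memory address.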

The ROCGDB and TotalView debuggers have both implemented the model from the consumers' side, and the AMD HIP/LLVM compilers produce a subset of the extensions for stack unwinding. I believe that AMD is working on further modifications to support variable locations.

About a month ago, Tony Tye from AMD did a presentation (DWARF Extensions for Optimized SIMT/SIMD (GPU) Debugging) at the Linux Plumbers Conference 2021, GNU Tools Cauldron. The presentation is here on YouTube:

https://youtu.be/QiR0ra0ymEY?t=10040

It starts at about 2:47:40 and is 30 minutes long. I think it does a good job of explaining the extensions visually, so you might want to watch it before reading the extensions document.

Cheers, John D.


On 10/24/21 12:58 PM, Martin via Dwarf-Discuss wrote:
The problem was already described here
https://nam12.safelinks.protection.outlook.com/?url="">
But the replies in that thread only answered the case for DW_AT_frame_base


However, suppose you have a data type laid out as described below:

- At "Address of variable": e.g. pointer to 0x1234
- At 0x1234: some data, e.g. a structure or array

Note that the address of the variable cannot be 0x1234, as taking the
address of the variable "&var" should return a pointer to the first block.

The variable can have a DW_AT_location
- that is an address in memory
- the name of a register
- a location list, that returns sometimes an address,
  sometimes a register

In order to access data at 0x1234, the data type would have a
DW_AT_data_location like:
  DW_OP_push_object_address, DW_OP_deref, ...
(maybe followed by adding an offset or other OPs)

But that only works for memory locations. (at least various emails on
the list suggest this)
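
As a rough illustration of the limitation, here is a toy Python sketch of evaluating "DW_OP_push_object_address, DW_OP_deref" when the object address is modelled as a plain integer. The function name, the memory contents, and the use of None for "no memory address" are all assumptions made for this example; this is not taken from any debugger.

```python
def eval_data_location(object_address, memory):
    """Toy evaluation of: DW_OP_push_object_address, DW_OP_deref.

    object_address is an integer memory address, or None when the
    variable does not live in memory (for example, it is in a register).
    """
    if object_address is None:
        # With only integer addresses on the stack there is nothing to push.
        raise ValueError("object has no memory address to push")
    stack = [object_address]    # DW_OP_push_object_address
    address = stack.pop()       # DW_OP_deref: pop an address ...
    stack.append(memory[address])  # ... and push the value stored there
    return stack.pop()


# Hypothetical layout: the variable at 0x2000 holds the pointer 0x1234.
memory = {0x2000: 0x1234, 0x1234: 99}

data_address = eval_data_location(0x2000, memory)  # works: variable in memory

try:
    eval_data_location(None, memory)  # variable held in a register
    register_case_failed = False
except ValueError:
    register_case_failed = True
```

The memory case yields the address of the data, while the register case has no address to start from in this model.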


How can such a type be represented?

Since DW_OP_push_object_address is documented as
  "The DW_OP_push_object_address operation pushes the address of the object
  currently being evaluated"

it would be good if DW_OP_deref would always work.

DW_OP_regN
  "The DW_OP_reg<n> operations encode the names of up to 32 registers,
  numbered from 0 through 31, inclusive. The object addressed is in register n"

The last sentence can be read as saying that this is "the address" of that register.

If so, then maybe DW_OP_deref should be documented to fetch the value in
the register.

The above example also illustrates why the register-content should be
the result of the dereferencing, as it leads to the same final address
(and that is correct if it is the same variable to start with)


And there is more:
The part of the variable data that is 0x1234 could be stored across
several registers. Then the DW_AT_location of the variable would contain
several DW_OP_piece.

There is an example that gives some clue what should happen, if a
DW_AT_location returns a value like this.
Page 291 (DWARF 5)
  DW_OP_lit1 DW_OP_stack_value DW_OP_piece 4 DW_OP_breg3 0 DW_OP_breg4 0
  DW_OP_plus DW_OP_stack_value DW_OP_piece 4
The object value is found in an anonymous (virtual) location

So DW_OP_piece also returns a location (albeit a virtual one). That
could then also be dereferenced by DW_OP_deref (or maybe it already can?).

IMHO, that should be explicitly clarified. Or alternatives should be given.
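
One possible reading of "dereferencing a virtual location" can be sketched in Python as follows: the pieces are assembled into an anonymous byte buffer, and a read from that buffer then stands in for DW_OP_deref. This is only an illustration of the idea, not documented DWARF 5 behaviour; the register values, the little-endian byte order, and the helper name are assumptions.

```python
def compose(pieces, byteorder="little"):
    """Concatenate (value, size_in_bytes) pieces in memory-address order."""
    return b"".join(value.to_bytes(size, byteorder) for value, size in pieces)


# Hypothetical register contents for r3 and r4.
regs = {3: 0x1000, 4: 0x234}

# DW_OP_lit1 DW_OP_stack_value DW_OP_piece 4
piece1 = (1, 4)
# DW_OP_breg3 0 DW_OP_breg4 0 DW_OP_plus DW_OP_stack_value DW_OP_piece 4
piece2 = ((regs[3] + 0) + (regs[4] + 0), 4)

# The anonymous (virtual) location holding the object's 8 bytes.
storage = compose([piece1, piece2])

# "Dereferencing" at offset 4 of the virtual location reads the second piece.
second_field = int.from_bytes(storage[4:8], "little")
first_field = int.from_bytes(storage[0:4], "little")
```

If the second piece were the hidden pointer, a consumer could then continue from its value to reach the pointed-to data.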


** Actual real world example.

FPC (Free Pascal) has several types that have such a hidden pointer
(AnsiString (array of char), dynamic array, objects).

The "pointer part" is being passed around as the variable (e.g. as a
function parameter, or on assignment).
All variables that have the same value in the "pointer part" share the
same data in the "pointed to part".

So the pointer part could end up in a register. But the type must
still be able to describe the data location.
_______________________________________________
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org




