Re: [Dwarf-Discuss] question on address spaces

Michael Eager Fri, 16 Oct 2015 09:30:06 -0700

On 10/05/2015 11:53 PM, Ashutosh Pal wrote:

Hi Dwarf Experts,


We have processor architectures that have multiple overlapping address spaces.
Our compiler tool-chain can map data variables (locals and globals) to these
address spaces. For example:

int GMEMX array1[100];  // allocates to GMEMX
int GMEMY array2[100];  // allocates to GMEMY

void foo() {
   vint8 vec1 = init_vec();        // allocates to stack on VMEM
   int a = get_element(vec1, 1);   // allocates to stack on SMEM
   ..
}

In the above example, there are 4 overlapping address spaces (GMEMX, GMEMY,
VMEM and SMEM), each of which has one variable mapped to it. To uniquely
identify a variable inside the debugging information, we would like to express
its location using a tuple <address-space-id, local-address>, where
‘address-space-id’ is an integer uniquely identifying the address space and
‘local-address’ is an expression that evaluates to an address within the
corresponding address space.

So, we are looking for ways to express the locations of these variables in the
DWARF information. We found two features in the DWARF 4 specification, but we
see drawbacks with both:

1. DW_AT_segment: This attribute can be specified on variable DIEs. We can use
it to encode our ‘address-space-id’, in tandem with DW_AT_location, to describe
the required tuple. But this does not cover cases where a variable resides in
2 different address spaces over its lifetime; for that, we would need an
operator that describes the address-space-id in the location description
itself.


DW_AT_segment was designed to represent i386-style addressing where a memory
address was represented by two pieces, a segment or page address and an offset
within the segment.  There are several 16-bit architectures which use similar
schemes to address more physical memory than can be addressed by a
register-sized pointer.  The understanding is that a physical memory address
can be arithmetically computed from the segment and offset.

There's no need for a segment to be encoded in a location expression with one
of these segmented architectures, since the locations are physical addresses.

While DW_AT_segment might be used to represent an address space id, this is
a bit different concept from a segment base address.  In general, a physical
address cannot be computed from an [ASID, address] pair.

I don't think it is made explicit in the DWARF standard, but aliasing of
addresses, where different [segment, offset] representations refer to the
same physical address, is something that is defined by the architecture, and
there is no specific support for it in DWARF.  On the i386, it was easy to
create different [segment, offset] representations for the same physical
address, but these were just as easily identified.  On other [page, offset]
architectures, this was less of an issue.
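
As a rough illustration of the segment arithmetic and aliasing described
above, here is a minimal C sketch.  It assumes 8086-style real-mode
addressing, where the physical address is segment * 16 + offset.  That formula
and the names seg_to_phys and seg_aliases are assumptions made for the
example; DWARF itself leaves the computation to the architecture.

```c
#include <stdint.h>

/* Assumption: 8086-style real mode, physical = segment * 16 + offset.
 * Other segmented or paged architectures use different arithmetic. */
uint32_t seg_to_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + (uint32_t)offset;
}

/* Two [segment, offset] pairs alias when they resolve to the same
 * physical address.  Returns 1 if they alias, 0 otherwise. */
int seg_aliases(uint16_t seg_a, uint16_t off_a,
                uint16_t seg_b, uint16_t off_b)
{
    return seg_to_phys(seg_a, off_a) == seg_to_phys(seg_b, off_b);
}
```

Under that assumption, [0x1234, 0x0010] and [0x1235, 0x0000] both resolve to
physical address 0x12350, so seg_aliases returns 1 for that pair.  No such
computation exists for an [ASID, address] pair in general.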

2. DW_OP_xderef: This is the only operator that allows encoding an
address-space-id along with an address. But we see 2 problems in expressing
the location of, say, a local variable residing on the stack at offset 20 in
address-space-id 4:

a. {DW_OP_bregx SP 20} {4} DW_OP_xderef: Evaluating this expression returns
the value of the local variable and NOT its location.

b. The specification also says that the size of the dereferenced value
returned by DW_OP_xderef should be less than or equal to the size of the
address on the target, while in our case the dereferenced values could be
bigger than the address. This further prevents us from applying the
DW_OP_stack_value operator to the above expression.


DWARF assumes that memory has unique physical addresses which are computable
and which can be used in a computation, for example, to index into an array.

DW_OP_xderef performs the implementation-defined computation to convert a
[segment, offset] pair into a physical memory address.  As such, the result
is a single value, limited in size to that of a physical memory address.
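
To illustrate why DW_OP_xderef yields a value rather than a location, here is
a toy C sketch of the stack behaviour given in DWARF 4: the top entry is
treated as the address, the second entry as the address space identifier, both
are popped, and an address-sized data item is pushed.  The ToyEval type, the
flat per-space memory arrays, and the 64-bit address size are hypothetical;
the real address calculation is implementation-defined.

```c
#include <stdint.h>
#include <string.h>

#define NUM_SPACES  8   /* hypothetical number of address spaces */
#define SPACE_SIZE  64  /* hypothetical size of each space in bytes */
#define STACK_DEPTH 16

/* Hypothetical toy target: overlapping spaces, each starting at address 0. */
typedef struct {
    uint8_t  mem[NUM_SPACES][SPACE_SIZE];
    uint64_t stack[STACK_DEPTH];
    int      depth;
} ToyEval;

/* Push one entry; returns 0 on success, -1 if the stack is full. */
int toy_push(ToyEval *ev, uint64_t v)
{
    if (ev->depth >= STACK_DEPTH)
        return -1;
    ev->stack[ev->depth++] = v;
    return 0;
}

/* DW_OP_xderef sketch: pop the address (top) and the ASID (second), then
 * push the address-sized value stored there.  The result is data, not a
 * location.  Returns 0 on success, -1 on underflow or a bad operand
 * (operands are already popped in the bad-operand case). */
int toy_xderef(ToyEval *ev)
{
    if (ev->depth < 2)
        return -1;
    uint64_t addr = ev->stack[--ev->depth];
    uint64_t asid = ev->stack[--ev->depth];
    if (asid >= NUM_SPACES || addr > SPACE_SIZE - sizeof(uint64_t))
        return -1;
    uint64_t value;
    memcpy(&value, &ev->mem[asid][addr], sizeof value);
    ev->stack[ev->depth++] = value;
    return 0;
}
```

Pushing 4, then 20, then calling toy_xderef leaves on the stack the 8 bytes
stored at offset 20 of space 4, not the pair [4, 20].  Anything wider than an
address cannot be returned this way.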

Are there other possibilities in the DWARF 4 standard that we overlooked for
the above scenarios? Also, any pointers to where people might have already
solved such issues would be useful to have.


One thought is that you expand your concept of address to incorporate
an ASID, rather than looking at the address as a tuple.  For example, if you
have a 64-bit address, the top 8 or 16 bits might be the ASID, while the
remaining 56 or 48 bits represent an address within that ASID.  As long as you
do not have overflow from the address into the ASID, there should be no
trouble in generating location expressions.  This doesn't address issues
related to aliasing or overlapping address spaces.
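
A minimal C sketch of this packing, assuming an 8-bit ASID in the top bits of
a 64-bit address and a 56-bit local address below it.  The macro and function
names are hypothetical, and the field widths are only one of the splits
mentioned above.

```c
#include <stdint.h>

#define ASID_BITS 8u                       /* assumed ASID width */
#define ADDR_BITS (64u - ASID_BITS)        /* remaining 56 address bits */
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* Combine an ASID and a local address into one 64-bit value.  Local
 * address bits above ADDR_BITS are discarded by the mask. */
uint64_t pack_address(uint8_t asid, uint64_t local_addr)
{
    return ((uint64_t)asid << ADDR_BITS) | (local_addr & ADDR_MASK);
}

/* Recover the ASID from the top bits. */
uint8_t address_asid(uint64_t packed)
{
    return (uint8_t)(packed >> ADDR_BITS);
}

/* Recover the address within the ASID. */
uint64_t address_local(uint64_t packed)
{
    return packed & ADDR_MASK;
}

/* Returns 1 if adding delta keeps the result inside the address field,
 * 0 if it would overflow into the ASID bits. */
int offset_is_safe(uint64_t packed, uint64_t delta)
{
    return delta <= ADDR_MASK - address_local(packed);
}
```

With these assumptions, a variable at local address 0x20 in ASID 4 is
represented as 0x0400000000000020, and ordinary address arithmetic works as
long as offset_is_safe holds.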

A more comprehensive representation of multiple address spaces, including
the possibility that they might overlap or that a variable might exist in
multiple address spaces at the same time, would seem to require a large number
of changes to DWARF.


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

_______________________________________________
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
