On 7/15/20 9:49 PM, David Blaikie wrote:


On Wed, Jul 15, 2020 at 7:07 PM Michael Eager via Dwarf-Discuss <dwarf-discuss@lists.dwarfstd.org> wrote:

    Segmented addresses have been in the DWARF specification since Version 2
    and AFAIK have not been changed since that time.  DWARF V5 did not add
    any functionality to segmented addresses that was not present in DWARF
    V2/3.  At least, there was no intention to do so.  Segmented addresses
    are described in Section 2.12.

    A segmented address maps into a linear address in a processor-specific
    fashion.

That seems at odds with the non-normative text of 2.12 "In some systems, addresses are specified as offsets within a given segment rather than as locations within a single flat address space."

That means that an x86 address could be represented as offset 0x1234 in segment 0x4444, which would translate to 0x44440+0x1234=0x45674. Note that x86 permits aliases, so that offset 0x0124 in segment 0x4555 is the same address.
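
As a rough illustration of the arithmetic above, here is a minimal Python sketch of 8086 real-mode translation (segment shifted left by 4 bits, plus offset). The names `to_linear` and `aliases` are made up for this example and are not part of DWARF or of any debugger API, and the sketch ignores the 20-bit wraparound of the original 8086.

```python
def to_linear(segment: int, offset: int) -> int:
    """Translate an 8086 real-mode segment:offset pair to a linear address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Segments start on 16-byte boundaries: linear = segment * 16 + offset.
    # Assumption: no 20-bit wraparound is applied to the result.
    return (segment << 4) + offset


def aliases(linear: int) -> list:
    """List every (segment, offset) pair that maps to the given linear address."""
    pairs = []
    for segment in range(0x10000):
        offset = linear - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


print(hex(to_linear(0x4444, 0x1234)))  # 0x45674
print(hex(to_linear(0x4555, 0x0124)))  # 0x45674, an alias of the same address
```

Because segments overlap at 16-byte intervals, `aliases(0x45674)` contains both pairs from the example, which is the many-to-one mapping mentioned in this thread.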

DWARF sometimes uses wording which is intended to generalize a concept. Conceivably, another architecture could use the same DWARF attribute in a similar way. That's why the non-normative text says "some systems" rather than specifically referencing x86. But we do have that as the only example listed in the table.

That would also be confusing to me - if there is a contiguous linear address space, why would DWARF need to specify the use of a segment selector, and why do some references to addresses allow the inclusion of a segment selector while others don't? Why not just always use the non-segmented address description for DWARF?

So that a compiler can generate segment address values and offsets independently. An x86 code generator may not know what segment it is generating code in.

AFAIK, all addresses can be segmented addresses, except in the line table where it isn't needed.

Perhaps we should have (long ago) required flat/linear addresses for x86 instead of segmented addresses.

& I don't find any mention of this idea that some addresses are absolute and some are segment-relative in 2.12 - it does say that "If none of the entries in the chain of parents for this entry back to its containing compilation unit entry have DW_AT_segment attributes, then the entry is assumed to exist within a flat address space." - as though a flat (I assume this is synonymous with "linear"?) address space is distinct from the segmented address space being discussed otherwise?

Flat address space == linear address space.

From a certain perspective, x86 memory space is broken up into 64K segments starting at 16-byte intervals, mapped onto a 1 MB linear address range.


    AFAIK, only the Intel 8086 and descendants have this
    functionality.  (It's a many to one mapping in the 8086 implementation,
    but that's a problem for a bygone era.)  There's a reference to i386
    memory models in Table 2.7.

    DWARF assumes a linear address space.  A segmented address maps to a
    specific address in this linear address space.  The entries in
    DW_AT_ranges for subprograms with different segment addresses would
    usually be referenced by their address in the linear address space.  If
    DW_AT_ranges has a DW_AT_segment, this is an indication that the
    debugger is to perform the processor-specific computation to translate
    the segment-address pair to the linear address.

    There is no need to do anything with segments in the line table, since
    the line table contains addresses in the linear address space.

    There is some (perhaps considerable) confusion in terminology in the x86
    world, because the x86 has multiple segment registers which on other
    processors would be called base registers.  The values in these
    registers reference memory segments and are added to whatever offset is
    contained in the program to generate an address.  These segment
    registers, and the memory segments which they point to, are NOT the
    segments represented by DW_AT_segment.

    Re "reading the segment selector" and "addrx encoding":  The addresses
    in DWARF DIEs are static, not dynamic.  There is no register+offset
    encoding, and processor registers are not read to determine where a
    subprogram is in memory.


Sorry, I don't quite follow the connections between all those statements.

Perhaps I didn't understand your comments about "reading the segment selector" and "addrx encoding".

TL;DR:
DW_AT_segment was designed to describe x86 memory model addresses: https://en.wikipedia.org/wiki/Intel_Memory_Model.

Possibly other architectures can use it, but I'm not familiar with any that do.


2.17 says that if a DIE has a DW_AT_high_pc and DW_AT_segment, then the high_pc is relative to the specified segment. That's a bit redundant if high_pc uses FORM_addrx, because the address in the address pool can specify its own segment, but a producer could choose which way to go there. (presumably if the AT_segment is there, you should interpret the addrx high_pc relative to that segment - assuming debug_addr has no segment selector in it - or perhaps it should go the other way and ignore the local AT_segment and only rely on whatever segment is in debug_addr)

DW_FORM_addrx (and the .debug_addr section) were introduced in DWARF V5 to allow compression of DW_FORM_addr addresses. DW_AT_segment is intended to describe an (x86) address in the form that the processor uses. The first is one of many different compression schemes in DWARF, the second is part of an architectural description.
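
To make the address-pool side of this concrete, here is a minimal Python sketch that reads one .debug_addr contribution as a list of {segment selector, address} pairs, following the layout described earlier in this thread. It assumes the 32-bit DWARF format, little-endian data, the selector stored before the address in each entry, and a DW_AT_addr_base that points at the first entry. The function name `parse_debug_addr` and the sample pool are hypothetical, not taken from any real tool or object file.

```python
import struct


def parse_debug_addr(data: bytes) -> list:
    """Parse one 32-bit-format, little-endian .debug_addr contribution.

    Returns a list of (segment, address) tuples; segment is None when
    segment_selector_size is 0.
    """
    unit_length, version, address_size, segment_selector_size = struct.unpack_from(
        "<IHBB", data, 0
    )
    pos = 8  # header: unit_length (4) + version (2) + two 1-byte size fields
    end = 4 + unit_length  # unit_length does not count the length field itself
    entries = []
    while pos < end:
        segment = None
        if segment_selector_size:
            segment = int.from_bytes(data[pos:pos + segment_selector_size], "little")
            pos += segment_selector_size
        address = int.from_bytes(data[pos:pos + address_size], "little")
        pos += address_size
        entries.append((segment, address))
    return entries


# Hypothetical pool: version 5, 2-byte addresses, 2-byte selectors, two entries.
body = struct.pack("<HBB", 5, 2, 2) + struct.pack("<HHHH", 0x4444, 0x1234, 0x4555, 0x0124)
pool = struct.pack("<I", len(body)) + body

entries = parse_debug_addr(pool)
segment, offset = entries[1]  # the entry a DW_FORM_addrx value of 1 would select
print(hex(segment), hex(offset))
```

In this sketch the addrx value is only an index; whether a consumer should prefer the selector stored in the pool or an enclosing DW_AT_segment is exactly the open question raised above.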

    On 7/15/20 4:31 PM, David Blaikie via Dwarf-Discuss wrote:
     > Looking at how segment selectors work:
     >
     > DW_AT_segment: Applies to a DIE subtree, including any ranges, high/low
     > pc, locations, labels, etc
     > debug_ranges/loc (v4 and below): Doesn't seem to allow specifying segment
     > variation - inherits from the segment given on the nearest parent DIE
     > that refers to the entry
     > debug_rnglists/loclists (v5): includes segment selector size in the
     > header, but doesn't seem to use it - segment selection via addresses in
     > the address pool (RLE/LLE_*x encodings) would allow fine-grained segment
     > selection, but direct address forms don't seem to allow segment
     > selection ("This operand is the same size as used in DW_FORM_addr.")
     > debug_addr: segment_size in header, then list of {segment selector, address}
     > debug_aranges: segment_size in header, then the list contains
     > triples of {segment selector, start address, length}
     > debug_line: v5 encodes the address and segment selector size in the
     > header, but I'm not sure if/how it's used. The DW_LNE_set_address
     > operation says:
     > "The DW_LNE_set_address opcode takes a single relocatable address as an
     > operand. The size of the operand is the size of an address on the target
     > machine. It sets the address register to the value given by the
     > relocatable address and sets the op_index register to 0." - doesn't
     > sound like it's reading the segment selector there.
     >
     > So... I don't think DWARFv5 made anything worse - if anything it did
     > enable /a/ way to use fine grained segment selectors in range lists and
     > location lists that doesn't appear, to me, to have been provided before.
     > (it could be needed if you had some functions in one segment and some
     > functions in another segment, which could be represented at the
     > subprogram DIE level - DW_AT_segment 1 on one DW_TAG_subprogram,
     > DW_AT_segment 2 on another DW_TAG_subprogram - but how would you
     > represent the DW_AT_ranges for this CU (in DWARFv4, or in DWARFv5
     > without using addrx encodings)? I don't know how, because I don't think
     > debug_ranges could describe one range list entry as being from one
     > segment, and another range list entry as being in another segment - they
     > would all be in whatever segment was in DW_AT_segment on the CU)
     >
     > does that make sense? Have I missed something about how you could use
     > segment selectors in a debug_loc, debug_ranges, or loclist/rnglist that
     > isn't using an addrx encoding?
     >
     > On Wed, Jul 15, 2020 at 6:37 AM Robinson, Paul via Dwarf-Discuss
     > <dwarf-discuss@lists.dwarfstd.org> wrote:
     >
     >
     >
     >      > -----Original Message-----
     >      > From: Dwarf-Discuss <dwarf-discuss-boun...@lists.dwarfstd.org> On Behalf
     >      > Of Xing GUO via Dwarf-Discuss
     >      > Sent: Tuesday, July 14, 2020 10:39 PM
     >      > To: dwarf-discuss@lists.dwarfstd.org
     >      > Subject: [Dwarf-Discuss] Segment selectors for the range list table.
     >      >
     >      > Hi there,
     >      >
     >      > The DWARFv5 spec mentioned that there might be segment selectors in
     >      > the range list entries and when the segment_selector_size is 0, the
     >      > segment selectors are omitted from the range list entries. However, it
     >      > didn't mention how the segment selector should be encoded when the
     >      > segment_selector_size isn't 0. Can anyone help me figure it out?
     >      > Thanks a lot!
     >
     >     Hi Xing,
     >
     >     The segment selectors in the range list would be encoded the same way
     >     as they would be in the main .debug_info section.  Range lists and
     >     location lists are essentially extensions to .debug_info, for cases
     >     where the range or location cannot be represented by simple DW_AT_*
     >     attribute values.
     >
     >     The specifics of encoding the segment selector would be whatever is
     >     appropriate to the target.  DWARF does not specify these details.
     >
     >     Best Regards,
     >     --paulr
     >
     >
     >      >
     >      > 7.28 (page 243)
     >      > The segment size is given by the segment_selector_size field of the
     >      > header, and the address size is given by the address_size field of the
     >      > header. If the segment_selector_size field in the header is zero, the
     >      > segment selector is omitted from the range list entries.
     >      >
     >      > --
     >      > Cheers,
     >      > Xing



-- Michael Eager
    _______________________________________________
    Dwarf-Discuss mailing list
    Dwarf-Discuss@lists.dwarfstd.org
    http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


--
Michael Eager