Anyone know of any existing DWARF producers that use segmented addressing? Might be interesting to see how they're using the features.
On Thu, Jul 16, 2020 at 9:30 AM Michael Eager <ea...@eagercon.com> wrote:

> On 7/15/20 9:49 PM, David Blaikie wrote:
> >
> > On Wed, Jul 15, 2020 at 7:07 PM Michael Eager via Dwarf-Discuss
> > <dwarf-discuss@lists.dwarfstd.org> wrote:
> >
> >     Segmented addresses have been in the DWARF specification since
> >     Version 2 and AFAIK have not been changed since that time.
> >     DWARF V5 did not add any functionality to segmented addresses
> >     that was not present in DWARF V2/3. At least, there was no
> >     intention to do so. Segmented addresses are described in
> >     Section 2.12.
> >
> >     A segmented address maps into a linear address in a
> >     processor-specific fashion.
> >
> > That seems at odds with the non-normative text of 2.12 "In some
> > systems, addresses are specified as offsets within a given segment
> > rather than as locations within a single flat address space."
>
> That means that an x86 address could be represented as offset 0x1234 in
> segment 0x4444, which would translate to 0x44440+0x1234=0x45674. Note
> that x86 permits aliases, so that offset 0x0124 in segment 0x4555 is the
> same address.
>
> DWARF sometimes uses wording which is intended to generalize a concept.
> Conceivably, another architecture could use the same DWARF attribute in
> a similar way. That's why the non-normative text says "some systems"
> rather than specifically referencing x86. But we do have that as the
> only example listed in the table.
>
> > And also would be confusing to me - if there is a contiguous linear
> > address space, why would DWARF need to specify the use of a segment
> > selector, and why do some references to addresses allow the inclusion
> > of a segment selector and some don't? Why not just always use the
> > non-segmented address description for DWARF?
>
> So that a compiler can generate segment address values and offsets
> independently. An x86 code generator may not know what segment it is
> generating code in.

But parts of the format (such as debug_ranges - sort of, and debug_line)
assume the DWARF producer does know the absolute address. It seems
inconsistent to me. Perhaps it's more like Paul was postulating - that the
spec assumes code is in a code segment/doesn't need to be clarified. (but
that gets a bit confused in debug_aranges - if it only is meant to contain
code (not data), why does it need a segment selector - and also in the
DIEs - if code is always in a known/assumable segment then why can you
vary segment for low_pc/high_pc/ranges?)

> AFAIK, all addresses can be segmented addresses, except in the line
> table where it isn't needed.
>
> Perhaps we should have (long ago) required flat/linear addresses for x86
> instead of segmented addresses.

What's the line table's segment_selector_size (in the DWARFv5 header) for?
(this sort of agrees and disagrees with you - it's there, but it's not
used in any part of the debug_line format that I can see)

What I'm confused by is how you can use segmented addressing on a
per-DIE-subtree basis (eg: one subprogram can specify one segment, and
another subprogram can specify a different segment - the spec seems to be
pretty clear that this is intentionally supported (says that it applies to
high_pc/low_pc/ranges, and that it delegates to the nearest parent DIE's
DW_AT_segment)) but it doesn't seem like you can have a range list (in v4
or v5 (except via the addrx encodings)) that can have different segments
for different subranges within a single range list. So how would you
describe the ranges of a CU that had subprograms in distinct segments? And
if you can't describe that - then why does the spec go out of its way to
explicitly allow it with the wording about segment overrides, etc.
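For reference, the real-mode arithmetic in Michael's example quoted above can be sketched in a few lines of Python. This is only an illustration: the name `real_mode_linear` is made up here, and the 20-bit wrap assumes an original 8086-style address bus with no lines beyond A19.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """8086 real-mode translation: shift the segment left by 4 bits
    (multiply by 16), add the offset, and wrap to 20 bits (assumption)."""
    return ((segment << 4) + offset) & 0xFFFFF


# The two segment:offset pairs from the quoted example alias each other.
a = real_mode_linear(0x4444, 0x1234)
b = real_mode_linear(0x4555, 0x0124)
print(hex(a), hex(b), a == b)
```

Because many segment:offset pairs land on the same linear address, a consumer that compares segmented addresses would presumably need to translate them first rather than compare the pairs field by field.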

> > & I don't find any mention of this idea that some addresses are
> > absolute and some are segment-relative in 2.12 - it does say that "If
> > none of the entries in the chain of parents for this entry back to its
> > containing compilation unit entry have DW_AT_segment attributes, then
> > the entry is assumed to exist within a flat address space." - as
> > though a flat (I assume this is synonymous with "linear"?) address
> > space is distinct from the segmented address space being discussed
> > otherwise?
>
> Flat address space == linear address space.
>
> From a certain perspective, x86 memory space is broken up into 64K
> segments starting at 16-byte intervals, mapped onto a 1 MB linear
> address range.
>
> >     AFAIK, only the Intel 8086 and descendants have this
> >     functionality. (It's a many to one mapping in the 8086
> >     implementation, but that's a problem for a bygone era.) There's a
> >     reference to i386 memory models in Table 2.7.
> >
> >     DWARF assumes a linear address space. A segmented address maps to
> >     a specific address in this linear address space. The entries in
> >     DW_AT_ranges for subprograms with different segment addresses
> >     would usually be referenced by their address in the linear address
> >     space. If DW_AT_ranges has a DW_AT_segment, this is an indication
> >     that the debugger is to perform the processor-specific computation
> >     to translate the segment-address pair to the linear address.
> >
> >     There is no need to do anything with segments in the line table,
> >     since the line table contains addresses in the linear address
> >     space.
> >
> >     There is some (perhaps considerable) confusion in terminology in
> >     the x86 world, because the x86 has multiple segment registers
> >     which on other processors would be called base registers. The
> >     values in these registers reference memory segments and are added
> >     to whatever offset is contained in the program to generate an
> >     address. These segment registers, and the memory segments which
> >     they point to, are NOT the segments represented by DW_AT_segment.
> >
> >     Re "reading the segment selector" and "addrx encoding": The
> >     addresses in DWARF DIEs are static, not dynamic. There is no
> >     register+offset encoding, and processor registers are not read to
> >     determine where a subprogram is in memory.
> >
> > Sorry, I don't quite follow the connections between all those
> > statements.
>
> Perhaps I didn't understand your comments about "reading the segment
> selector" and "addrx encoding".
>
> TL;DR:
> DW_AT_segment was designed to describe x86 memory model addresses:
> https://en.wikipedia.org/wiki/Intel_Memory_Model
>
> Possibly other architectures can use it, but I'm not familiar with any
> that do.
>
> > 2.17 says that if a DIE has a DW_AT_high_pc and DW_AT_segment, then
> > the high_pc is relative to the specified segment. That's a bit
> > redundant if high_pc uses FORM_addrx, because the address in the
> > address pool can specify its own segment, but a producer could choose
> > which way to go there. (presumably if the AT_segment is there, you
> > should interpret the addrx high_pc relative to that segment - assuming
> > debug_addr has no segment selector in it - or perhaps it should go the
> > other way and ignore the local AT_segment and only rely on whatever
> > segment is in debug_addr)
>
> DW_FORM_addrx (and the .debug_addr section) were introduced in DWARF V5
> to allow compression of DW_FORM_addr addresses. DW_AT_segment is
> intended to describe an (x86) address in the form that the processor
> uses. The first is one of many different compression schemes in DWARF,
> the second is part of an architectural description.

debug_addr supports segment selectors - in the debug_addr header it has a
field for "segment selector size" and the entries in the address list are
"segment/address pairs".
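To make that layout concrete, here is a rough sketch of reading such a table. It assumes 32-bit DWARF, little-endian data, and that each entry stores the selector first and the address second (one reading of "segment/address pairs"). `parse_debug_addr` is a hypothetical helper written for this message, not part of any existing library.

```python
import struct


def parse_debug_addr(data: bytes):
    """Parse one .debug_addr-style table (32-bit DWARF, little-endian assumed)."""
    # Header: unit_length (4), version (2), address_size (1), segment_selector_size (1).
    unit_length, version, address_size, seg_size = struct.unpack_from("<IHBB", data, 0)
    if unit_length == 0xFFFFFFFF:
        raise NotImplementedError("64-bit DWARF is not handled in this sketch")
    if address_size == 0:
        raise ValueError("address_size must be non-zero")
    end = 4 + unit_length  # unit_length does not count its own 4 bytes
    pos = 8                # entries start right after the 8-byte header
    entries = []
    while pos + seg_size + address_size <= end:
        # Assumption: selector first, then address; selector omitted when size is 0.
        segment = int.from_bytes(data[pos:pos + seg_size], "little") if seg_size else None
        pos += seg_size
        address = int.from_bytes(data[pos:pos + address_size], "little")
        pos += address_size
        entries.append((segment, address))
    return version, entries


# Two entries in different segments: 2-byte selectors, 4-byte addresses.
body = struct.pack("<HI", 1, 0x1000) + struct.pack("<HI", 2, 0x2000)
table = struct.pack("<IHBB", 4 + len(body), 5, 4, 2) + body
print(parse_debug_addr(table))
```

With a table like this, two addrx operands in one range list could point at entries carrying different selectors, which is the fine-grained case discussed in this thread.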
So now there are two ways a segment selector for an address could be
specified - if you had a DW_TAG_subprogram with a DW_AT_low_pc using addrx
into a debug_addr with a non-zero segment selector and the subprogram also
had a DW_AT_segment, I wonder which one is meant to win.

Though mostly my point was: since debug_addr entries can have segment
selectors, then debug_rnglists can have different segments for different
subranges within a single range list. But without that (either using
direct addresses, or in v4 debug_ranges) you couldn't vary segment across
a single range list. Though the debug_rnglists header does have a segment
selector size in it - it doesn't seem to use it anywhere in its format
(similarly, debug_loclists and debug_line v5 have a segment selector size,
but don't seem to use it?).

> >     On 7/15/20 4:31 PM, David Blaikie via Dwarf-Discuss wrote:
> >     > Looking at how segment selectors work:
> >     >
> >     > DW_AT_segment: Applies to a DIE subtree, including any ranges,
> >     >   high/low pc, locations, labels, etc
> >     > debug_ranges/debug_loc (v4 and below): Doesn't seem to allow
> >     >   specifying segment variation - inherits from the segment given
> >     >   on the nearest parent DIE that refers to the entry
> >     > debug_rnglists/debug_loclists (v5): includes segment selector
> >     >   size in the header, but doesn't seem to use it - segment
> >     >   selection via addresses in the address pool (RLE/LLE_*x
> >     >   encodings) would allow fine-grained segment selection, but
> >     >   direct address forms don't seem to allow segment selection
> >     >   ("This operand is the same size as used in DW_FORM_addr.")
> >     > debug_addr: segment_size in header, then list of {segment
> >     >   selector, address}
> >     > debug_aranges: segment_size in header, then the list contains
> >     >   triples of {segment selector, start address, length}
> >     > debug_line: v5 encodes the address and segment selector size in
> >     >   the header, but I'm not sure if/how it's used. The
> >     >   DW_LNE_set_address operation says:
> >     >   "The DW_LNE_set_address opcode takes a single relocatable
> >     >   address as an operand. The size of the operand is the size of
> >     >   an address on the target machine. It sets the address register
> >     >   to the value given by the relocatable address and sets the
> >     >   op_index register to 0." - doesn't sound like it's reading the
> >     >   segment selector there.
> >     >
> >     > So... I don't think DWARFv5 made anything worse - if anything it
> >     > did enable /a/ way to use fine grained segment selectors in
> >     > range lists and location lists that doesn't appear, to me, to
> >     > have been provided before.
> >     > (it could be needed if you had some functions in some segment
> >     > and some functions in another segment (which could be
> >     > represented at the subprogram DIE level - DW_AT_segment 1 on one
> >     > DW_TAG_subprogram, DW_AT_segment 2 on another DW_TAG_subprogram -
> >     > but how would you represent the DW_AT_ranges for this CU (in
> >     > DWARFv4, or in DWARFv5 without using addrx encodings)? I don't
> >     > know how, because I don't think debug_ranges could describe one
> >     > range list entry as being from one segment, and another range
> >     > list entry as being in another segment - they would all be in
> >     > whatever segment was in DW_AT_segment on the CU)
> >     >
> >     > does that make sense? Have I missed something about how you
> >     > could use segment selectors in a debug_loc, debug_ranges, or
> >     > loclist/rnglist that isn't using an addrx encoding?
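To make the inheritance rule in the quoted summary concrete, here is a minimal sketch of the lookup as Section 2.12 is quoted in this thread: walk from a DIE up through its parents, take the first DW_AT_segment found, and fall back to a flat address space if there is none. The `Die` class and `effective_segment` are hypothetical names for illustration, not any real library's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Die:
    tag: str
    segment: Optional[int] = None    # value of DW_AT_segment, if the DIE has one
    parent: Optional["Die"] = None


def effective_segment(die: Optional[Die]) -> Optional[int]:
    """Return the nearest enclosing DW_AT_segment, or None for a flat address space."""
    node = die
    while node is not None:
        if node.segment is not None:
            return node.segment
        node = node.parent
    return None


# The case from the quoted message: two subprograms in different segments.
cu = Die("DW_TAG_compile_unit")
f1 = Die("DW_TAG_subprogram", segment=1, parent=cu)
f2 = Die("DW_TAG_subprogram", segment=2, parent=cu)
inner = Die("DW_TAG_lexical_block", parent=f1)
print(effective_segment(inner), effective_segment(f2), effective_segment(cu))
```

In this toy model the CU itself resolves to no segment at all, so a single DW_AT_ranges on the CU has nowhere to say that one entry belongs to segment 1 and another to segment 2, which is the gap raised in the quoted message.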
> >     >
> >     > On Wed, Jul 15, 2020 at 6:37 AM Robinson, Paul via Dwarf-Discuss
> >     > <dwarf-discuss@lists.dwarfstd.org> wrote:
> >     >
> >     >     > -----Original Message-----
> >     >     > From: Dwarf-Discuss
> >     >     > <dwarf-discuss-boun...@lists.dwarfstd.org> On Behalf Of
> >     >     > Xing GUO via Dwarf-Discuss
> >     >     > Sent: Tuesday, July 14, 2020 10:39 PM
> >     >     > To: dwarf-discuss@lists.dwarfstd.org
> >     >     > Subject: [Dwarf-Discuss] Segment selectors for the range
> >     >     > list table.
> >     >     >
> >     >     > Hi there,
> >     >     >
> >     >     > The DWARFv5 spec mentioned that there might be segment
> >     >     > selectors in the range list entries and when the
> >     >     > segment_selector_size is 0, the segment selectors are
> >     >     > omitted from the range list entries. However, it didn't
> >     >     > mention how the segment selector should be encoded when
> >     >     > the segment_selector_size isn't 0. Can anyone help me
> >     >     > figure it out?
> >     >     > Thanks a lot!
> >     >
> >     >     Hi Xing,
> >     >
> >     >     The segment selectors in the range list would be encoded the
> >     >     same way as they would be in the main .debug_info section.
> >     >     Range lists and location lists are essentially extensions to
> >     >     .debug_info, for cases where the range or location cannot be
> >     >     represented by simple DW_AT_* attribute values.
> >     >
> >     >     The specifics of encoding the segment selector would be
> >     >     whatever is appropriate to the target. DWARF does not
> >     >     specify these details.
> >     >
> >     >     Best Regards,
> >     >     --paulr
> >     >
> >     >     > 7.28 (page 243)
> >     >     > The segment size is given by the segment_selector_size
> >     >     > field of the header, and the address size is given by the
> >     >     > address_size field of the header. If the
> >     >     > segment_selector_size field in the header is zero, the
> >     >     > segment selector is omitted from the range list entries.
> >     >     >
> >     >     > --
> >     >     > Cheers,
> >     >     > Xing
> >
> >     --
> >     Michael Eager
> >     _______________________________________________
> >     Dwarf-Discuss mailing list
> >     Dwarf-Discuss@lists.dwarfstd.org
> >     http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
>
> --
> Michael Eager