Re: [Dwarf-Discuss] Interaction between aranges and unit proposals
Hi Eric,

On Tue, 2014-04-01 at 16:51 -0700, Eric Christopher wrote:
> On Tue, Apr 1, 2014 at 4:38 AM, Mark Wielaard wrote:
> > Is there a way to reconcile these proposals so they keep the benefit of
> > both (quick/complete address scan without having to load/parse bulk data
> > and simplifying the DWARF data structures by combining various units in
> > one section)?
>
> Absolutely a fan. Knowing what various consumers need is going to be
> key for any tables to speed up access.

So for the .debug_aranges table the two proposals try to make it possible for a consumer to quickly create a table of address ranges that describes which parts of .debug_info might need to be read when an address is encountered, without having to actually read any of the .debug_info/abbrev at all (if possible). There are two reasons this currently cannot be done.

First, producers often just skip generating an aranges entry for units that don't cover any addresses, so you don't know whether it was just not generated in the first place or really is empty. That is what issue 100430.2 tries to address; GCC was changed to follow this recommendation.

Secondly, you can sadly not be sure that all producers follow the previous recommendation (it is deemed a quality of service matter whether an aranges entry is generated for a CU), so if you have a module that combines the output of various producers you need a way to check they all really produced aranges entries for all the units. That is what issue 100430.1 tries to address. By adding a unit length field like other tables have, you can just scan the aranges headers, check there are no gaps of uncovered debug_info data, and not have to even try to load the .debug_info/.debug_abbrev data in that case.

Of course if you do find a gap you still need to read in and scan through all the unit data itself, but at least you know you are doing it on purpose and only for those modules that were generated by producers that don't generate aranges for all units. GDB noticed this really matters for larger programs with lots of modules; just having to map in and scan through all the .debug sections you might not need creates a big (startup) delay.

> > One way might be to reverse the last proposal. Instead of removing the
> > aranges for type units (which did indeed not make much sense in the
> > split .debug_info/.debug_type approach), add an empty aranges header if
> > a type unit appears in .debug_info in the way of the second proposal for
> > address-less CUs.
>
> We could do this, but I think adding one for every type unit would be
> a bit wasteful. Since type units are going to have a flag in the
> header would it be possible for you to notice that when looking
> through the units? I'm not sure how you know that you have complete
> coverage so I'm just throwing out words here, could you provide a bit
> of a description of how this works for me if you don't mind?

You are right. It certainly is a trade-off. The goal is to not have to read any of the unit data if at all possible. With the type units separate in .debug_types that was easy.

Maybe the solution is to have an alternate .debug_aranges header just for empty units that is as small as possible? Or reuse the existing header fields as "flag"? Maybe have the proposed header format of issue 100430.1, but if address_size and segment_size are both zero then no address range descriptor will be added and that header signals a "no-address" unit?

Cheers,

Mark
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
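The gap check described in the message above can be sketched roughly as follows. This is only an illustration, not code from GDB or any other consumer. It assumes each .debug_aranges header has already been parsed into a (debug_info_offset, debug_info_length) pair, where debug_info_length is the hypothetical field proposed in issue 100430.1 (it does not exist in DWARF 4) and is taken to cover the whole unit, including its initial length field. The function name is made up for the example.

```python
def find_uncovered_gaps(arange_headers, debug_info_size):
    """Return (start, end) byte ranges of .debug_info that no aranges header claims.

    arange_headers:  iterable of (debug_info_offset, debug_info_length) pairs,
                     one per .debug_aranges header. The length field is the
                     hypothetical addition from issue 100430.1.
    debug_info_size: total size of the .debug_info section in bytes.
    """
    gaps = []
    pos = 0  # first byte of .debug_info not yet known to be covered
    for offset, length in sorted(arange_headers):
        if offset > pos:
            # Some unit(s) between pos and offset have no aranges header.
            gaps.append((pos, offset))
        pos = max(pos, offset + length)
    if pos < debug_info_size:
        gaps.append((pos, debug_info_size))
    return gaps
```

An empty result would mean every unit has an aranges header, so the consumer could skip loading .debug_info and .debug_abbrev up front. Any gap returned marks the part of the section that still has to be scanned the slow way.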
Re: [Dwarf-Discuss] Default Location List Entry Issue 130121.1
On Tue, 2014-04-01 at 18:42 -0700, Michael Eager wrote:
> On 04/01/14 13:54, Mark Wielaard wrote:
>
> > What about using the presence of a DW_AT_external attribute on the data
> > object that has a single location expression to know whether the described
> > location is valid/visible outside of the enclosing lexical scope?
> >
> > Using that or some new flag (DW_AT_global_scope) to mark a data object
> > that has a single location description with global scope might be cheaper
> > than encoding it with a location list pointer and a default entry.
>
> Both DW_AT_external attribute and a hypothetical DW_AT_global_scope
> attribute would describe scope, not the storage life of the object.
> C unfortunately has confabulated the two concepts.

In practice I believe DW_AT_external does both and I have used it that way to know whether to trust a single location description is globally valid. But yes, just like lifetime both "external" and "scope" are again bad words to use in this case and don't really express the thing we want (they might only accidentally).

What about we call the flag attribute DW_AT_global_visible or DW_AT_global_location then it would be clear I think because it signals this isn't about language or DWARF tree (lexical) scoping?

> Objects which have only a single location can be described with a location
> expression. They don't need a location list with a default entry.

For a data object that has a single location descriptor (DW_AT_location in exprloc class form), the valid range is given by the address ranges of the DIEs that own the data object. So it only lets me express the location in a restricted range. I do need to use a location list with (just) a default entry if I want to indicate that the location description has a valid global range.

Is the above correct? Or is there another way to express that a location description is globally valid? Or am I misunderstanding the purpose of having a default entry in a location list?
Thanks,

Mark
Re: [Dwarf-Discuss] Interaction between aranges and unit proposals
On Wed, 2014-04-02 at 12:18 +0200, Mark Wielaard wrote:
> Maybe the solution is to have an alternate .debug_aranges header just
> for empty units that is as small as possible? Or reuse the existing
> header fields as "flag"? Maybe have the proposed header format of issue
> 100430.1 but if address_size and segment_size are both zero then no
> address range descriptor will be added and that header signals a
> "no-address" unit?

I forgot, there is another "solution". You could try to be less pedantically correct than GDB is in following the DWARF standard. The elfutils tools like eu-addr2line, and the libdwfl library functions that map addresses to debug lines or DIEs, for example, just assume aranges isn't an optional thing and that it will always be complete. That makes the elfutils tools a lot faster than GDB, but obviously not as universal (they just fail to match if no aranges are found). For this to work for other tools, however, the indexes should be "upgraded" from quality of service to mandatory.

Cheers,

Mark
Re: [Dwarf-Discuss] Default Location List Entry Issue 130121.1
On 04/02/14 03:43, Mark Wielaard wrote:
> On Tue, 2014-04-01 at 18:42 -0700, Michael Eager wrote:
> > On 04/01/14 13:54, Mark Wielaard wrote:
> > > What about using the presence of a DW_AT_external attribute on the data
> > > object that has a single location expression to know whether the described
> > > location is valid/visible outside of the enclosing lexical scope?
> > >
> > > Using that or some new flag (DW_AT_global_scope) to mark a data object
> > > that has a single location description with global scope might be cheaper
> > > than encoding it with a location list pointer and a default entry.
> >
> > Both DW_AT_external attribute and a hypothetical DW_AT_global_scope
> > attribute would describe scope, not the storage life of the object.
> > C unfortunately has confabulated the two concepts.
>
> In practice I believe DW_AT_external does both and I have used it that
> way to know whether to trust a single location description is globally
> valid. But yes, just like lifetime both "external" and "scope" are again
> bad words to use in this case and don't really express the thing we want
> (they might only accidentally).

Section 3.3.1 describes the DW_AT_external with the following:

    If the name of the subroutine described by an entry with the tag
    DW_TAG_subprogram is visible outside of its containing compilation
    unit, that entry has a DW_AT_external attribute, which is a flag.

Section 4.1 contains the following (describing an object):

    2. A DW_AT_external attribute, which is a flag, if the name of a
       variable is visible outside of its enclosing compilation unit.

There is nothing about the lifetime of any object. Confabulating that the scope attributes in DWARF imply something different is not supported by the DWARF standard and will lead to misinterpretation of the DWARF standard.

> What about we call the flag attribute DW_AT_global_visible or
> DW_AT_global_location then it would be clear I think because it signals
> this isn't about language or DWARF tree (lexical) scoping?

Symbol visibility is adequately described by the DW_AT_external attribute as defined above. Object locations are adequately described by location expressions and location lists.

> > Objects which have only a single location can be described with a location
> > expression. They don't need a location list with a default entry.
>
> For a data object that has a single location descriptor (DW_AT_location
> in exprloc class form), the valid range is given by the address ranges
> of the DIEs that own the data object. So it only lets me express the
> location in a restricted range. I do need to use a location list with
> (just) a default entry if I want to indicate that the location
> description has a valid global range.

No, this interpretation is incorrect and is not supported by reading the DWARF Standard. Please see Section 2.6. Perhaps you are confused by the following from Section 2.6 (which I think is unambiguous):

    1. Single location descriptions, which are a language independent
       representation of addressing rules of arbitrary complexity built
       from DWARF expressions and/or other DWARF operations specific to
       describing locations. They are sufficient for describing the
       location of any object as long as its lifetime is either static
       or the same as the lexical block that owns it, and it does not
       move during its lifetime.

You seem to have interpreted "either/or" in the second sentence to mean "and".

> Is the above correct? Or is there another way to express that a location
> description is globally valid? Or am I misunderstanding the purpose of
> having a default entry in a location list?

You do not need a location list for a static object. A simple location expression will suffice.

A default location list entry (as proposed in 130121.1) gives the location of an object for address values which are not otherwise specified in the location list.
--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
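To make the role of a default entry concrete, here is a small sketch of how a consumer might resolve a location list that carries one. It is an illustration under stated assumptions, not the encoding from issue 130121.1: the list is modelled as already-decoded (low_pc, high_pc, expr) tuples with the ending address exclusive, the default entry is passed separately, and expressions are plain strings. The function name is invented for the example.

```python
def resolve_location(entries, default, pc):
    """Pick the location expression that applies at address pc.

    entries: list of (low_pc, high_pc, expr) tuples covering the half-open
             range [low_pc, high_pc), a simplified model of a location list.
    default: expression of the default entry, or None if the list has none.
    """
    for low_pc, high_pc, expr in entries:
        if low_pc <= pc < high_pc:
            return expr
    # No bounded entry matched: the default entry applies if there is one,
    # otherwise the object has no known location at this address.
    return default
```

In this model a static object needs no list at all, only a single expression. The default entry matters for an object that lives in one place most of the time but is moved elsewhere (for example into a register) over a few address ranges.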
Re: [Dwarf-Discuss] Interaction between aranges and unit proposals
> To make it possible to quickly see whether an address (range) is covered
> by an ELF file containing DWARF information two proposals were made:
>
> aranges does not have debug info length
> http://dwarfstd.org/ShowIssue.php?issue=100430.1
>
> debug_aranges and address-less CUs
> http://dwarfstd.org/ShowIssue.php?issue=100430.2

We dropped the first and made the second a best practice.

> But then there are also the following proposals:
>
> Type Unit Merge
> http://dwarfstd.org/ShowIssue.php?issue=130526.1

Yes, this is pretty much why we dropped the first proposal.

> Ambiguity in DWARF4 of debug_info_offset in .debug_aranges
> http://dwarfstd.org/ShowIssue.php?issue=100816.1

I don't see how this one is relevant. This was really just an editorial change -- .debug_aranges physically could not point to a .debug_types section. In addition, with the type unit merge proposal, it becomes completely irrelevant (there will be no .debug_types section at all).

> One way might be to reverse the last proposal. Instead of removing the
> aranges for type units (which did indeed not make much sense in the
> split .debug_info/.debug_type approach), add an empty aranges header if
> a type unit appears in .debug_info in the way of the second proposal for
> address-less CUs.

I don't really like the idea of having aranges sections for type units. It would be nicer to keep type units separate, since there are *so* many more of them than there are compunits, but for the accelerator table proposal, we needed a unified address space for the two, and this was the simplest way to accomplish that.

I think it's fine for a consumer to first assume that the .debug_aranges table is complete, but if an address lookup fails, then it can scan the .debug_info section, hopping from one CU/TU to the next, looking for CUs that aren't covered by .debug_aranges tables. Having the debug_info length field in the aranges table would help, but even then, it's not clear to me how much it will help. With unified CUs and TUs, this scan will be costly with or without the length field -- for every CU, you'll probably have to skip over dozens or hundreds of TUs. It might help if we could guarantee that all CUs precede all TUs, but the only reasonable way to do that is to keep them in separate sections to begin with!

Adding empty aranges tables for the TUs will bloat the .debug_aranges section significantly, and it would add a burden on the producers to make sure that the .debug_aranges contributions are in the same COMDAT group as the TU itself (GCC, for example, still doesn't put related sections in the same group).

I think if we can come up with a way to have index entries in the accelerator tables point to either a CU or a TU without having a unified address space, we wouldn't need to merge .debug_info and .debug_types, and we could reopen the first proposal.

-cary
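The two lookup strategies discussed in this thread (trust .debug_aranges and fail on a miss, or trust it first and scan .debug_info only when a lookup fails) can be sketched as one function. This is a rough illustration and not how GDB or elfutils are implemented. The aranges data is assumed to be pre-parsed into (start, length, unit_offset) tuples, and slow_scan stands in for a hypothetical routine that hops through .debug_info from one CU/TU to the next.

```python
def find_unit(pc, aranges, slow_scan=None):
    """Map an address to the .debug_info offset of the unit covering it.

    aranges:   list of (start, length, unit_offset) tuples, a simplified
               stand-in for parsed .debug_aranges data.
    slow_scan: optional callable pc -> unit_offset or None that walks the
               unit data itself (hypothetical; not implemented here).
    """
    # Fast path: assume the aranges table is complete.
    for start, length, unit_offset in aranges:
        if start <= pc < start + length:
            return unit_offset
    # Strict mode: with no fallback, a miss simply means "no match".
    if slow_scan is None:
        return None
    # Fallback: the table may be incomplete, so pay for the full scan
    # only after the cheap lookup has failed.
    return slow_scan(pc)
```

Called without slow_scan, this behaves like the assume-complete approach described earlier in the thread. Passing a scanner gives the lookup-then-scan behaviour, where the expensive walk over CUs and TUs happens only on a miss.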