Re: [Dwarf-Discuss] Interaction between aranges and unit proposals

2014-04-02 Thread Mark Wielaard
Hi Eric,

On Tue, 2014-04-01 at 16:51 -0700, Eric Christopher wrote:
> On Tue, Apr 1, 2014 at 4:38 AM, Mark Wielaard wrote:
> > Is there a way to reconcile these proposals so they keep the benefit of
> > both (quick/complete address scan without having to load/parse bulk data
> > and simplifying the DWARF data structures by combining various units in
> > one section)?
> >
> Absolutely a fan. Knowing what various consumers need is going to be
> key for any tables to speed up access.

So for the .debug_aranges table the two proposals try to make it
possible for a consumer to quickly create a table of address ranges that
describes which parts of .debug_info might need to be read when an
address is encountered, without having to actually read any of
the .debug_info/.debug_abbrev data at all (if possible). There are two
reasons this currently cannot be done.

First, producers often just skip generating an aranges entry for units
that don't cover any addresses, so you don't know whether the entry was
simply not generated or the unit really is empty. That is what issue
100430.2 tries to address; GCC was changed to follow this
recommendation.

Secondly, you sadly cannot be sure that all producers follow the
previous recommendation (it is deemed a quality of service matter
whether an aranges entry is generated for a CU), so if you have a module
that combines the output of various producers you need a way to check
that they all really produced aranges entries for all the units. That is
what issue 100430.1 tries to address. By adding a unit length field like
other tables have, you can just scan the aranges headers, check that
there are no gaps of uncovered .debug_info data, and in that case not
even try to load the .debug_info/.debug_abbrev data. Of course if you do
find a gap you still need to read in and scan through all the unit data
itself, but at least you know you are doing it on purpose and only for
those modules that were generated by producers that don't generate
aranges for all units. GDB noticed this really matters for larger
programs with lots of modules: just having to map in and scan through
all the .debug sections you might not need creates a big (startup)
delay.
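
To make the gap check concrete, here is a minimal sketch in Python. It
assumes each aranges header has already been reduced to a pair of
(debug_info_offset, debug_info_length), where debug_info_length stands
in for the length field proposed in issue 100430.1 and is taken to be
the total size of the unit in bytes. The function name and the input
shape are made up for illustration; this is not how GDB or any other
consumer implements it.

```python
def find_uncovered_gaps(headers, debug_info_size):
    """Return (start, end) byte ranges of .debug_info not claimed by any header.

    headers: iterable of (debug_info_offset, debug_info_length) pairs.
    debug_info_length is the hypothetical field from issue 100430.1,
    assumed here to be the full size of the unit in bytes.
    """
    gaps = []
    pos = 0  # end of the region known to be covered so far
    for offset, length in sorted(headers):
        if offset > pos:
            # Some unit data between pos and offset has no aranges header.
            gaps.append((pos, offset))
        pos = max(pos, offset + length)
    if pos < debug_info_size:
        gaps.append((pos, debug_info_size))
    return gaps
```

An empty result means the headers alone account for every byte of
.debug_info, so the unit data does not have to be loaded; any gap
returned is a region that would still need the slow scan.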

> > One way might be to reverse the last proposal. Instead of removing the
> > aranges for type units (which did indeed not make much sense in the
> > split .debug_info/.debug_type approach), add an empty aranges header if
> > a type unit appears in .debug_info in the way of the second proposal for
> > address-less CUs.
> >
> We could do this, but I think adding one for every type unit would be
> a bit wasteful. Since type units are going to have a flag in the
> header would it be possible for you to notice that when looking
> through the units? I'm not sure how you know that you have complete
> coverage so I'm just throwing out words here, could you provide a bit
> of a description of how this works for me if you don't mind?

You are right. It certainly is a trade-off. The goal is to not have to
read any of the unit data if at all possible. With the type units
separate in .debug_types that was easy.

Maybe the solution is to have an alternate .debug_aranges header just
for empty units that is as small as possible? Or reuse the existing
header fields as a "flag"? Maybe have the proposed header format of issue
100430.1, but if address_size and segment_size are both zero then no
address range descriptor will be added and that header signals a
"no-address" unit?

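A rough sketch of what the "both zero" check could look like for a
consumer, in Python. It parses the existing 32-bit DWARF 4
.debug_aranges header layout (unit_length, version, debug_info_offset,
address_size, segment_size) and assumes little-endian data. The extra
length field of issue 100430.1 is left out because its exact layout is
not given here, and the zero/zero convention is only the suggestion
above, not anything in the standard. All names are hypothetical.

```python
import struct
from typing import NamedTuple


class ArangesHeader(NamedTuple):
    unit_length: int
    version: int
    debug_info_offset: int
    address_size: int
    segment_size: int


# Little-endian, 32-bit DWARF: 4 + 2 + 4 + 1 + 1 = 12 bytes, no padding.
_HEADER = struct.Struct("<IHIBB")


def parse_header(data, offset=0):
    """Decode one aranges header starting at the given byte offset."""
    return ArangesHeader(*_HEADER.unpack_from(data, offset))


def is_no_address_unit(header):
    # Hypothetical convention from the suggestion above: both sizes zero
    # means the unit covers no addresses and no range tuples follow.
    return header.address_size == 0 and header.segment_size == 0
```

With this convention a header for an address-less unit would cost 12
bytes in 32-bit DWARF, since no padding or terminator tuple is needed.
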
Cheers,

Mark

___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


Re: [Dwarf-Discuss] Default Location List Entry Issue 130121.1

2014-04-02 Thread Mark Wielaard
On Tue, 2014-04-01 at 18:42 -0700, Michael Eager wrote:
> On 04/01/14 13:54, Mark Wielaard wrote:
> 
> > What about using the presence of a DW_AT_external attribute on the data
> > object that has a single location expression to know whether the described
> > location is valid/visible outside of the enclosing lexical scope?
> >
> > Using that or some new flag (DW_AT_global_scope) to mark a data object
> > that has a single location description with global scope might be cheaper
> > than encoding it with a location list pointer and a default entry.
> 
> Both DW_AT_external attribute and a hypothetical DW_AT_global_scope
> attribute would describe scope, not the storage life of the object.
> C unfortunately has confabulated the two concepts.

In practice I believe DW_AT_external does both, and I have used it that
way to decide whether to trust that a single location description is
globally valid. But yes, just like "lifetime", both "external" and
"scope" are again bad words to use in this case and don't really express
the thing we want (they might do so only accidentally).

What if we call the flag attribute DW_AT_global_visible or
DW_AT_global_location? Then it would be clear, I think, because it
signals that this isn't about language or DWARF tree (lexical) scoping.

> Objects which have only a single location can be described with a location
> expression.  They don't need a location list with a default entry.

For a data object that has a single location descriptor (DW_AT_location
in exprloc class form), the valid range is given by the address ranges
of the DIEs that own the data object. So it only lets me express the
location in a restricted range. I do need to use a location list with
(just) a default entry if I want to indicate that the location
description has a valid global range.

Is the above correct? Or is there another way to express that a location
description is globally valid? Or am I misunderstanding the purpose of
having a default entry in a location list?

Thanks,

Mark



Re: [Dwarf-Discuss] Interaction between aranges and unit proposals

2014-04-02 Thread Mark Wielaard
On Wed, 2014-04-02 at 12:18 +0200, Mark Wielaard wrote:
> Maybe the solution is to have an alternate .debug_aranges header just
> for empty units that is as small as possible? Or reuse the existing
> header fields as a "flag"? Maybe have the proposed header format of issue
> 100430.1, but if address_size and segment_size are both zero then no
> address range descriptor will be added and that header signals a
> "no-address" unit?

I forgot, there is another "solution". You could choose not to be as
pedantically correct as GDB is in following the DWARF standard. The
elfutils tools like eu-addr2line, and the libdwfl library functions that
map addresses to debug lines or DIEs, for example, just assume aranges
isn't an optional thing and that it will always be complete. That makes
the elfutils tools a lot faster than GDB, but obviously not as universal
(they just fail to match if no aranges are found). For this to work for
other tools, however, the indexes should be "upgraded" from quality of
service to mandatory.
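
As an illustration of the "trust aranges" strategy, here is a small
Python sketch of an address lookup that consults only a table built from
.debug_aranges and simply reports no match otherwise. It is not the
elfutils implementation; the class name, the tuple layout and the
assumption that ranges do not overlap are all made up for the example.

```python
import bisect


class ArangesIndex:
    """Address -> CU offset lookup that trusts aranges to be complete."""

    def __init__(self, ranges):
        # ranges: iterable of (start_address, length, cu_offset) tuples,
        # assumed (for this sketch) not to overlap.
        self._ranges = sorted(ranges)
        self._starts = [r[0] for r in self._ranges]

    def lookup(self, addr):
        # Find the last range whose start is <= addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, length, cu_offset = self._ranges[i]
            if addr < start + length:
                return cu_offset
        # No fallback scan of .debug_info: a miss is simply a miss.
        return None
```

The speed comes from never touching .debug_info on a miss, which is also
exactly why the result is only as good as the producer's aranges.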

Cheers,

Mark



Re: [Dwarf-Discuss] Default Location List Entry Issue 130121.1

2014-04-02 Thread Michael Eager

On 04/02/14 03:43, Mark Wielaard wrote:
> On Tue, 2014-04-01 at 18:42 -0700, Michael Eager wrote:
> > On 04/01/14 13:54, Mark Wielaard wrote:
> >
> > > What about using the presence of a DW_AT_external attribute on the data
> > > object that has a single location expression to know whether the described
> > > location is valid/visible outside of the enclosing lexical scope?
> > >
> > > Using that or some new flag (DW_AT_global_scope) to mark a data object
> > > that has a single location description with global scope might be cheaper
> > > than encoding it with a location list pointer and a default entry.
> >
> > Both DW_AT_external attribute and a hypothetical DW_AT_global_scope
> > attribute would describe scope, not the storage life of the object.
> > C unfortunately has confabulated the two concepts.
>
> In practice I believe DW_AT_external does both and I have used it that
> way to know whether to trust a single location description is globally
> valid. But yes, just like lifetime both "external" and "scope" are again
> bad words to use in this case and don't really express the thing we want
> (they might only accidentally).

Section 3.3.1 describes the DW_AT_external with the following:

  If the name of the subroutine described by an entry with the tag
  DW_TAG_subprogram is visible outside of its containing compilation
  unit, that entry has a DW_AT_external attribute, which is a flag.

Section 4.1 contains the following (describing an object):

  2. A DW_AT_external attribute, which is a flag, if the name of a
  variable is visible outside of its enclosing compilation unit.

There is nothing about the lifetime of any object.  Confabulating that
the scope attributes in DWARF imply something different is not supported
by the DWARF standard and will lead to misinterpretation of the DWARF standard.


> What about we call the flag attribute DW_AT_global_visible or
> DW_AT_global_location then it would be clear I think because it signals
> this isn't about language or DWARF tree (lexical) scoping?


Symbol visibility is adequately described by the DW_AT_external attribute
as defined above.  Object locations are adequately described by location
expressions and location lists.


> > Objects which have only a single location can be described with a location
> > expression.  They don't need a location list with a default entry.
>
> For a data object that has a single location descriptor (DW_AT_location
> in exprloc class form), the valid range is given by the address ranges
> of the DIEs that own the data object. So it only lets me express the
> location in a restricted range. I do need to use a location list with
> (just) a default entry if I want to indicate that the location
> description has a valid global range.


No, this interpretation is incorrect and is not supported by reading the
DWARF Standard.  Please see Section 2.6.

Perhaps you are confused by the following from Section 2.6 (which I think
is unambiguous):

  1. Single location descriptions, which are a language independent
  representation of addressing rules of arbitrary complexity built
  from DWARF expressions and/or other DWARF operations specific to
  describing locations. They are sufficient for describing the location
  of any object as long as its lifetime is either static or the same as
  the lexical block that owns it, and it does not move during its lifetime.

You seem to have interpreted "either/or" in the second sentence to mean "and".


> Is the above correct? Or is there another way to express that a location
> description is globally valid? Or am I misunderstanding the purpose of
> having a default entry in a location list?


You do not need a location list for a static object.  A simple location
expression will suffice.

A default location list entry (as proposed in 130121.1) gives the location
of an object for address values which are not otherwise specified in the
location list.
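
The role of a default entry can be sketched in a few lines of Python.
This only models the lookup semantics described above; it is not an
encoding of the 130121.1 proposal. The tuple layout, the half-open
[low, high) ranges and the placeholder strings standing in for location
expressions are assumptions made for the example.

```python
def resolve_location(pc, entries, default=None):
    """Pick the location expression that applies at address pc.

    entries: list of (low_pc, high_pc, expr) with half-open [low, high)
    ranges. default: expression to use when no entry covers pc, or None
    if the list has no default entry.
    """
    for low, high, expr in entries:
        if low <= pc < high:
            return expr
    # No explicit range matched: fall back to the default entry, if any.
    return default
```

A static object needs none of this: a single location expression on the
DIE already describes it, so the default entry only matters for objects
that otherwise need a location list.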

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [Dwarf-Discuss] Interaction between aranges and unit proposals

2014-04-02 Thread Cary Coutant
> To make it possible to quickly see whether an address (range) is covered
> by an ELF file containing DWARF information two proposals were made:
>
> aranges does not have debug info length
> http://dwarfstd.org/ShowIssue.php?issue=100430.1
>
> debug_aranges and address-less CUs
> http://dwarfstd.org/ShowIssue.php?issue=100430.2

We dropped the first and made the second a best practice.

> But then there are also the following proposals:
>
> Type Unit Merge
> http://dwarfstd.org/ShowIssue.php?issue=130526.1

Yes, this is pretty much why we dropped the first proposal.

> Ambiguity in DWARF4 of debug_info_offset in .debug_aranges
> http://dwarfstd.org/ShowIssue.php?issue=100816.1

I don't see how this one is relevant. This was really just an
editorial change -- .debug_aranges physically could not point to a
.debug_types section. In addition, with the type unit merge proposal,
it becomes completely irrelevant (there will be no .debug_types
section at all).

> One way might be to reverse the last proposal. Instead of removing the
> aranges for type units (which did indeed not make much sense in the
> split .debug_info/.debug_type approach), add an empty aranges header if
> a type unit appears in .debug_info in the way of the second proposal for
> address-less CUs.

I don't really like the idea of having aranges sections for type
units. It would be nicer to keep type units separate, since there are
*so* many more of them than there are compunits, but for the
accelerator table proposal, we needed a unified address space for the
two, and this was the simplest way to accomplish that.

I think it's fine for a consumer to first assume that the
.debug_aranges table is complete, but if an address lookup fails, then
it can scan the .debug_info section, hopping from one CU/TU to the
next, looking for CUs that aren't covered by .debug_aranges tables.
Having the debug_info length field in the aranges table would help,
but even then, it's not clear to me how much it will help.
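
A minimal Python sketch of the fallback scan: hop from one unit header
to the next using only the initial length field, and report units whose
offsets no aranges table points at. It assumes a little-endian
.debug_info blob already in memory, and it does not tell CUs from TUs
(that would need more of each unit header to be read). The function
names are made up for illustration.

```python
import struct


def iter_unit_offsets(debug_info):
    """Yield the byte offset of each unit header in .debug_info."""
    pos = 0
    while pos < len(debug_info):
        yield pos
        (length,) = struct.unpack_from("<I", debug_info, pos)
        if length == 0xFFFFFFFF:
            # 64-bit DWARF: the real length follows as an 8-byte value.
            (length,) = struct.unpack_from("<Q", debug_info, pos + 4)
            pos += 12 + length
        else:
            pos += 4 + length


def units_missing_aranges(debug_info, aranges_offsets):
    """Return offsets of units that no aranges header refers to."""
    covered = set(aranges_offsets)
    return [off for off in iter_unit_offsets(debug_info) if off not in covered]
```

Every unit still costs one header read here, which is the concern raised
next: with CUs and TUs in one section the loop has to step over every TU
as well.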

With unified CUs and TUs, this scan will be costly with or without the
length field -- for every CU, you'll probably have to skip over dozens
or hundreds of TUs. It might help if we could guarantee that all CUs
precede all TUs, but the only reasonable way to do that is to keep
them in separate sections to begin with! Adding empty aranges tables
for the TUs will bloat the .debug_aranges section significantly, and
it would add a burden on the producers to make sure that the
.debug_aranges contributions are in the same COMDAT group as the TU
itself (GCC, for example, still doesn't put related sections in the
same group).

I think if we can come up with a way to have index entries in the
accelerator tables point to either a CU or a TU without having a
unified address space, we wouldn't need to merge .debug_info and
.debug_types, and we could reopen the first proposal.

-cary