Re: [Dwarf-Discuss] Default Location List Entry Issue 130121.1

2014-04-03 Thread Mark Wielaard
On Wed, 2014-04-02 at 08:35 -0700, Michael Eager wrote:
> On 04/02/14 03:43, Mark Wielaard wrote:
> > On Tue, 2014-04-01 at 18:42 -0700, Michael Eager wrote:
> >> On 04/01/14 13:54, Mark Wielaard wrote:
> >>
> >>> What about using the presence of a DW_AT_external attribute on the data
> >>> object that has a single location expression to know whether the described
> >>> location is valid/visible outside of the enclosing lexical scope?
> >>>
> >>> Using that or some new flag (DW_AT_global_scope) to mark a data object
> >>> that has a single location description with global scope might be cheaper
> >>> than encoding it with a location list pointer and a default entry.
> >>
> >> Both DW_AT_external attribute and a hypothetical DW_AT_global_scope
> >> attribute would describe scope, not the storage life of the object.
> >> C unfortunately has conflated the two concepts.
> >
> > In practice I believe DW_AT_external does both and I have used it that
> > way to know whether to trust a single location description is globally
> > valid. But yes, just like lifetime both "external" and "scope" are again
> > bad words to use in this case and don't really express the thing we want
> > (they might only accidentally).
> 
> Section 3.3.1 describes the DW_AT_external with the following:
> 
>    If the name of the subroutine described by an entry with the tag
>    DW_TAG_subprogram is visible outside of its containing compilation
>    unit, that entry has a DW_AT_external attribute, which is a flag.
> 
> Section 4.1 contains the following (describing an object):
> 
>    2. A DW_AT_external attribute, which is a flag, if the name of a
>       variable is visible outside of its enclosing compilation unit.
> 
> There is nothing about the lifetime of any object.  Confabulating that
> the scope attributes in DWARF imply something different is not supported
> by the DWARF standard and will lead to misinterpretation of the DWARF 
> standard.
>
> > What about we call the flag attribute DW_AT_global_visible or
> > DW_AT_global_location then it would be clear I think because it signals
> > this isn't about language or DWARF tree (lexical) scoping?
> 
> Symbol visibility is adequately described by the DW_AT_external attribute
> as defined above.  Object locations are adequately described by location
> expressions and location lists.

I cannot tell whether we are in violent agreement or disagreement.
Probably both :) I guess the confusion comes from some misunderstanding
about assumptions of terminology used.

As was said before, the term "lifetime" as used in the description of
location descriptors is confusing, since DWARF doesn't talk about the
lifetime of objects anywhere else. What I seem to be missing is how to
express, for a data object with a single location description, the
address ranges where that location description is valid. I think it
would really help if the use of "lifetime" in the definition of location
descriptors was dropped and replaced by a definition based on valid
address ranges.

But maybe I am just misunderstanding something by reading too much or
too little in this definition. I'll ask for a more specific example
below so we don't have to argue about the specifics of words/definitions
but can just agree on what a producer should emit for a consumer to
know which ranges are valid.

> >> Objects which have only a single location can be described with a location
> >> expression.  They don't need a location list with a default entry.
> >
> > For a data object that has a single location descriptor (DW_AT_location
> > in exprloc class form), the valid range is given by the address ranges
> > of the DIEs that own the data object. So it only lets me express the
> > location in a restricted range. I do need to use a location list with
> > (just) a default entry if I want to indicate that the location
> > description has a valid global range.
> 
> No, this interpretation is incorrect and is not supported by reading the
> DWARF Standard.  Please see Section 2.6.
> 
> Perhaps you are confused by the following from Section 2.6 (which I think
> is unambiguous):
> 
> 1. Single location descriptions, which are a language independent
>    representation of addressing rules of arbitrary complexity built
>    from DWARF expressions and/or other DWARF operations specific to
>    describing locations. They are sufficient for describing the location
>    of any object as long as its lifetime is either static or the same as
>    the lexical block that owns it, and it does not move during its
>    lifetime.
> 
> You seem to have interpreted "either/or" in the second sentence to mean "and".

You are correct that I am confused about this definition. Not because of
the either/or but about how to express the choices in DWARF. I don't
understand how for a DWARF Data Object DIE I express whether a) its
lifetime is static or b) the lifetime is the same as the lexical block
that owns it.

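I assume that a "static lifetime" means the single location description
of the Data Object DIE is valid for all address ranges. And I assume a
"lifetime that is the same as the lexical block that owns it" means that
the ranges for which the single location description is valid are the
same as the ranges of the owning Program Scope DIE of the Data Object
DIE.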

Re: [Dwarf-Discuss] Interaction between aranges and unit proposals

2014-04-03 Thread Mark Wielaard
On Wed, 2014-04-02 at 10:21 -0700, Cary Coutant wrote:
> > To make it possible to quickly see whether an address (range) is covered
> > by an ELF file containing DWARF information two proposals were made:
> >
> > aranges does not have debug info length
> > http://dwarfstd.org/ShowIssue.php?issue=100430.1
> >
> > debug_aranges and address-less CUs
> > http://dwarfstd.org/ShowIssue.php?issue=100430.2
> 
> We dropped the first and made the second a best practice.
> 
> > But then there are also the following proposals:
> >
> > Type Unit Merge
> > http://dwarfstd.org/ShowIssue.php?issue=130526.1
> 
> Yes, this is pretty much why we dropped the first proposal.

It isn't marked as dropped BTW. The status is open with you as champion.

> > Ambiguity in DWARF4 of debug_info_offset in .debug_aranges
> > http://dwarfstd.org/ShowIssue.php?issue=100816.1
> 
> I don't see how this one is relevant. This was really just an
> editorial change -- .debug_aranges physically could not point to a
> .debug_types section. In addition, with the type unit merge proposal,
> it becomes completely irrelevant (there will be no .debug_types
> section at all).

It is only relevant because it was a link between .debug_aranges and the
type units. If type units move to .debug_info then having them linked
like other units is what is needed to make aranges useful/complete. I
just mention it because looking at the individual proposals it might be
hard to see how an editorial change like that might impact other
proposals when you combine them.

> > One way might be to reverse the last proposal. Instead of removing the
> > aranges for type units (which did indeed not make much sense in the
> > split .debug_info/.debug_type approach), add an empty aranges header if
> > a type unit appears in .debug_info in the way of the second proposal for
> > address-less CUs.
> 
> I don't really like the idea of having aranges sections for type
> units. It would be nicer to keep type units separate, since there are
> *so* many more of them than there are compunits

Agreed. The same holds for partial units, which are often used in the
same way (though they aren't constrained to hold only type DIEs).

> , but for the
> accelerator table proposal, we needed a unified address space for the
> two, and this was the simplest way to accomplish that.

I might have missed that discussion.

> I think it's fine for a consumer to first assume that the
> .debug_aranges table is complete, but if an address lookup fails, then
> it can scan the .debug_info section, hopping from one CU/TU to the
> next, looking for CUs that aren't covered by .debug_aranges tables.

I think that is not very helpful. Unfortunately, when introspecting or
debugging a program you often have "invalid" addresses: an uninitialized
pointer, a bad backtrace, a user asking "does this register point to a
known function", etc. So the negative case will actually happen a lot,
meaning that you would quickly have to load and scan all of .debug_info
anyway. Especially in a big program with lots of modules, that can be
expensive.
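
To make the cost of the negative case concrete, here is a minimal sketch of the lookup strategy described in the quoted paragraph: trust .debug_aranges first, and on a miss hop through every unit in .debug_info that has no aranges table. The dictionaries and the find_unit helper are hypothetical stand-ins for illustration, not the API of any real DWARF library.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]  # half-open address range [low, high)


def find_unit(addr: int,
              aranges: Dict[int, List[Range]],
              units: Dict[int, List[Range]]) -> Tuple[Optional[int], int]:
    """Return (unit offset or None, number of units scanned in .debug_info).

    aranges maps a unit offset to the ranges listed in its aranges table.
    units maps every unit offset in .debug_info to its real ranges
    (an empty list for type units and other address-less units).
    """
    # Fast path: the address is listed in some aranges table.
    for offset, ranges in aranges.items():
        if any(low <= addr < high for low, high in ranges):
            return offset, 0

    # Slow path: visit every unit that has no aranges table.
    scanned = 0
    for offset, ranges in units.items():
        if offset in aranges:
            continue  # already covered by .debug_aranges
        scanned += 1
        if any(low <= addr < high for low, high in ranges):
            return offset, scanned
    return None, scanned


# Hypothetical layout: two CUs with code, two type units in between,
# and a producer that emitted an aranges table for the first CU only.
units = {
    0x000: [(0x1000, 0x2000)],
    0x100: [],
    0x200: [],
    0x300: [(0x3000, 0x4000)],
}
aranges = {0x000: [(0x1000, 0x2000)]}

print(find_unit(0x1800, aranges, units))      # hit in .debug_aranges
print(find_unit(0x3800, aranges, units))      # found only by scanning
print(find_unit(0xDEAD0000, aranges, units))  # bogus address, full scan
```

In this model a bogus address always pays for the full scan of uncovered units, which is the concern raised above.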

> Having the debug_info length field in the aranges table would help,
> but even then, it's not clear to me how much it will help.

It only helps if producers follow best practices more often than not
and generate aranges headers for all units. Then you can at least use
the aranges headers to determine whether or not you really have to scan
the .debug_info to collect more ranges (and skip those you know are
already complete).
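
As a rough illustration of that check, the sketch below assumes each aranges header carries the offset of its unit in .debug_info plus a unit length (the length field is the proposed addition from 100430.1, not something in DWARF 4). The uncovered_spans helper is a made-up name, and "length" is assumed to mean the unit's total size in bytes.

```python
from typing import List, Tuple


def uncovered_spans(headers: List[Tuple[int, int]],
                    debug_info_size: int) -> List[Tuple[int, int]]:
    """Return the half-open byte spans of .debug_info named by no header.

    headers is a list of (debug_info_offset, debug_info_length) pairs, one
    per aranges header. An empty result means every unit has a header, so
    the consumer does not need to scan .debug_info at all.
    """
    gaps = []
    cursor = 0
    for offset, length in sorted(headers):
        if offset > cursor:
            gaps.append((cursor, offset))  # bytes nobody claimed
        cursor = max(cursor, offset + length)
    if cursor < debug_info_size:
        gaps.append((cursor, debug_info_size))
    return gaps


# Hypothetical section of 0x300 bytes; one unit (at 0x180) has no header.
headers = [(0x000, 0x100), (0x100, 0x080), (0x200, 0x100)]
for low, high in uncovered_spans(headers, 0x300):
    print(f"must scan .debug_info bytes {low:#x}..{high:#x}")
```

Only the reported spans need to be parsed, and units already named by a header can be skipped.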

> With unified CUs and TUs, this scan will be costly with or without the
> length field -- for every CU, you'll probably have to skip over dozens
> or hundreds of TUs. It might help if we could guarantee that all CUs
> precede all TUs, but the only reasonable way to do that is to keep
> them in separate sections to begin with! Adding empty aranges tables
> for the TUs will bloat the .debug_aranges section significantly, and
> it would add a burden on the producers to make sure that the
> .debug_aranges contributions are in the same COMDAT group as the TU
> itself (GCC, for example, still doesn't put related sections in the
> same group).

Yes, all true. So you have to either generate (empty) arange headers for
all the TUs and PUs or move the no-ranges units somewhere else for the
consumer to be able to check the aranges table is complete.

Personally I think it makes sense to just mandate that producers make
sure the aranges headers are complete. Then you don't need to guess
whether there are some units not covered. Which means you don't need
producers to generate empty arange headers and you don't need to add
the debug info length to allow consumers to check whether all units are
covered. (Which is actually what we do in elfutils, we just
complain/file bugs against producers that don't generate aranges. GDB is
just a nicer consumer that tries to get things correct, but that does
mean a big memory/speed penalty at the moment unless both 100430.1 and
100430.2 

Re: [Dwarf-Discuss] Default Location List Entry Issue 130121.1

2014-04-03 Thread Robinson, Paul
> > A default location list entry (as proposed in 130121.1) gives the
> > location of an object for address values which are not otherwise
> > specified in the location list.
> 
> Maybe an example of this would be helpful too. I am under the (wrong)
> impression that a default location list entry would be used to describe
> "static objects". But the use case is probably different than I imagine.

Consider a subprogram with a local stack-allocated variable.  In the
simple case, a simple location description gives that location, and
of course it's valid for the address-range of the containing subprogram.

The compiler might optimize this variable into a register for part of
the subprogram, a different register for a different part of the
subprogram, and leave it on the stack otherwise.  This can be described
by a set of normal location list entries for the ranges where the
variable is in a register, and a default location entry for the entire
range of the subprogram, giving the stack location.  Of course the 
compiler *could* emit many individual entries with the stack location,
filling in the gaps, but the default location entry is more compact.

In any case, the locations for the local variable are valid only for
the address range of the containing subprogram.  Well, technically
only for the address range that has a valid stack frame.
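
The example above can be sketched as a small consumer-side model. This is only an illustration under stated assumptions: Entry, lookup and fill_gaps are hypothetical names, the addresses and register choices are made up, and the location strings merely stand in for encoded location descriptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    low_pc: int    # inclusive start of the range
    high_pc: int   # exclusive end of the range
    location: str  # stand-in for a location description


def lookup(entries: List[Entry], default: Optional[str], pc: int,
           scope: Tuple[int, int]) -> Optional[str]:
    """Bounded entries win, then the default; nothing outside the subprogram."""
    low, high = scope
    if not (low <= pc < high):
        return None
    for entry in entries:
        if entry.low_pc <= pc < entry.high_pc:
            return entry.location
    return default


def fill_gaps(entries: List[Entry], stack_loc: str,
              low: int, high: int) -> List[Entry]:
    """The equivalent list without a default: one explicit entry per gap."""
    out, cursor = [], low
    for entry in sorted(entries, key=lambda e: e.low_pc):
        if cursor < entry.low_pc:
            out.append(Entry(cursor, entry.low_pc, stack_loc))
        out.append(entry)
        cursor = max(cursor, entry.high_pc)
    if cursor < high:
        out.append(Entry(cursor, high, stack_loc))
    return out


# Hypothetical subprogram covering [0x1000, 0x1100).
scope = (0x1000, 0x1100)
entries = [
    Entry(0x1010, 0x1040, "DW_OP_reg3"),
    Entry(0x1080, 0x10C0, "DW_OP_reg5"),
]
default = "DW_OP_fbreg -24"
explicit = fill_gaps(entries, default, *scope)

print(lookup(entries, default, 0x1020, scope))  # in a register range
print(lookup(entries, default, 0x1050, scope))  # falls back to the default
print(len(entries) + 1, "entries with a default,", len(explicit), "without")
```

Both encodings answer every lookup identically; the default entry just avoids spelling out each gap.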

--paulr


___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


Re: [Dwarf-Discuss] Default Location List Entry Issue 130121.1

2014-04-03 Thread Michael Eager

On 04/03/14 01:51, Mark Wielaard wrote:
> On Wed, 2014-04-02 at 08:35 -0700, Michael Eager wrote:
> > Perhaps you are confused by the following from Section 2.6 (which I think
> > is unambiguous):
> >
> >   1. Single location descriptions, which are a language independent
> >      representation of addressing rules of arbitrary complexity built
> >      from DWARF expressions and/or other DWARF operations specific to
> >      describing locations. They are sufficient for describing the location
> >      of any object as long as its lifetime is either static or the same as
> >      the lexical block that owns it, and it does not move during its
> >      lifetime.
> >
> > You seem to have interpreted "either/or" in the second sentence to mean "and".
>
> You are correct that I am confused about this definition. Not because of
> the either/or but about how to express the choices in DWARF. I don't
> understand how for a DWARF Data Object DIE I express whether a) its
> lifetime is static or b) the lifetime is the same as the lexical block
> that owns it.


Consider a lexical block or function in C which has the following:

A data object X which is defined as
  static int X;
would have a static lifetime.  It is created when the program is loaded
and is valid for the entire duration that the program runs.

Another data object Y, defined as
  register int Y;
would (most likely) be allocated to a register.  This object is created
when the lexical block or function is entered and is destroyed when the
function exits.  Its location would only be valid while the execution PC
is located within the range specified by the low_pc/high_pc of the enclosing
lexical block.

Both of these objects can be described as a simple location expression.
There may be other ways to describe these objects using location lists,
perhaps with a default, but there is no compelling reason to do this.

When you are interpreting the DWARF description for an object, you can
only determine whether a location expression is valid by interpreting it.
At different times during a program's execution, a location expression may
be valid and at other times it may be invalid.  If the location is dependent
on values which are dependent on the lexical block (e.g., a register value
which is only valid within the block, or an offset from the stack frame)
then the location expression is only valid if the execution PC is within
the lexical block (for a register value) or if the frame can be found by
walking the stack (for the offset case).  If the expression doesn't require
information dependent on the lexical block (i.e., the object has static
lifetime), then it is always valid.
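
The rule in the paragraph above can be written down as a small decision function. This is a simplified sketch, not consumer code from any real debugger: the Depends enum and is_valid are hypothetical names, the block is modeled as one half-open [low_pc, high_pc) range, and "the frame can be found by walking the stack" is approximated by checking whether any frame's PC lies inside the block.

```python
from enum import Enum
from typing import List, Tuple


class Depends(Enum):
    NOTHING = "static"     # e.g. a fixed address; object has static lifetime
    REGISTER = "register"  # a register value only meaningful inside the block
    FRAME = "frame"        # an offset from the stack frame


def is_valid(depends: Depends, block: Tuple[int, int], pc: int,
             frame_pcs: List[int]) -> bool:
    """Decide whether a single location expression can be trusted right now.

    block is the (low_pc, high_pc) of the owning lexical block, pc is the
    current execution PC, and frame_pcs holds the PC of every frame found
    by walking the stack (innermost first).
    """
    low, high = block
    if depends is Depends.NOTHING:
        return True                 # nothing block-dependent: always valid
    if depends is Depends.REGISTER:
        return low <= pc < high     # must be executing inside the block
    # FRAME case: some live frame must belong to the block.
    return any(low <= frame_pc < high for frame_pc in frame_pcs)


block = (0x400, 0x480)  # hypothetical lexical block
print(is_valid(Depends.NOTHING, block, 0x9999, []))
print(is_valid(Depends.REGISTER, block, 0x500, [0x500, 0x420]))
print(is_valid(Depends.FRAME, block, 0x500, [0x500, 0x420]))
```

The last two calls model a callee executing at 0x500 while the block's frame is still on the stack: the register-based location can no longer be trusted, but the frame-based one can.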


> I assume that a "static lifetime" means the single location description
> of the Data Object DIE is valid for all address ranges. And I assume a
> "lifetime that is the same as the lexical block that owns it" means that
> the ranges for which the single location description is valid are the
> same as the ranges of the owning Program Scope DIE of the Data Object
> DIE.


Correct.

Although "lifetime of an object" is a term of art in computer science, I
think that it is well defined and understood.  We'll revisit using this
terminology when we edit the DWARF Version 5 standard prior to releasing
a public draft.


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077