dwarf_aggregate_size doesn't work with arrays in partial CUs

2021-09-25 Thread KJ Tsanaktsidis via Elfutils-devel
Hi folks,

I'm writing a program that uses ptrace to poke at internal OpenSSL
data structures for another process. I'm using libdw to parse the
DWARF data for the copy of OpenSSL actually linked in to the target
process, so I can extract struct offsets, member sizes and the like
and poke at the right places.

I've run into an issue where dwarf_aggregate_size can't calculate the
size of an array, when the array is included in a partial CU
(DW_TAG_partial_unit). If the array unit includes a DW_AT_upper_bound
attribute, but not a DW_AT_lower_bound attribute, then
dwarf_aggregate_size will infer the lower bound based on the
DW_AT_language attribute of the enclosing CU (i.e. whether the language
uses zero- or one-based indexing).
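
To make that inference concrete, here is a minimal standalone sketch of the arithmetic involved. It is not libdw's code: the `sketch_*` names are made up for illustration, and only a handful of language codes (with the values the DWARF standard assigns them) are included. It assumes a subrange that carries an upper bound but no lower bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A few language codes with the values assigned by the DWARF standard.
   SKETCH_LANG_NONE is hypothetical: it stands for "no DW_AT_language". */
enum sketch_lang {
  SKETCH_LANG_NONE = 0,
  SKETCH_LANG_C89 = 0x0001,
  SKETCH_LANG_C = 0x0002,
  SKETCH_LANG_C_PLUS_PLUS = 0x0004,
  SKETCH_LANG_FORTRAN77 = 0x0007,
  SKETCH_LANG_FORTRAN90 = 0x0008,
  SKETCH_LANG_C99 = 0x000c,
};

/* Store the default lower bound for LANG in *RESULT and return 0,
   or return -1 if the language is unknown (or missing). */
static int
sketch_default_lower_bound (int lang, int64_t *result)
{
  assert (result != NULL);
  switch (lang)
    {
    case SKETCH_LANG_C89:
    case SKETCH_LANG_C:
    case SKETCH_LANG_C_PLUS_PLUS:
    case SKETCH_LANG_C99:
      *result = 0;              /* zero-based indexing */
      return 0;
    case SKETCH_LANG_FORTRAN77:
    case SKETCH_LANG_FORTRAN90:
      *result = 1;              /* one-based indexing */
      return 0;
    default:
      return -1;                /* cannot infer a lower bound */
    }
}

/* Element count of a subrange that has only an upper bound. */
static int
sketch_subrange_count (int lang, int64_t upper, int64_t *count)
{
  int64_t lower;
  assert (count != NULL);
  if (sketch_default_lower_bound (lang, &lower) != 0)
    return -1;
  *count = upper - lower + 1;
  return 0;
}
```

With an upper bound of 9 and a C language code this yields 10 elements; with no language at all the count cannot be computed, which is the failure described here.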

However, the debug symbols I'm looking at for OpenSSL from the Ubuntu
repositories have the DW_AT_language on the full compilation unit
entries, but not in the partial ones included in them. This means that
calling dwarf_aggregate_size on the array type DIE does not work.

The DWARF spec doesn't really seem to have anything to say on the
matter (all it says is "A full or partial compilation unit entry may
have the following attributes", but doesn't say what it logically
means if an attribute is present on the complete CU but not a partial
one).

I guess it doesn't really make sense for a single compilation unit to
contain multiple languages? So I wonder if dwarf_srclang (called by
dwarf_aggregate_size) should crawl through the list of CUs to see if
the DIE's CU is included in a CU that _does_ specify DW_AT_language
(recursively, I suppose). Then, we can infer that the partial CU's
language is the same as the enclosing one.
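
Roughly, the crawl could look like the sketch below. This is a simplified standalone model rather than libdw code: `struct sketch_unit`, its fields and `sketch_inherited_lang` are all hypothetical, units are identified by array index, and a language value of 0 stands for "no DW_AT_language". Each lookup scans every unit's import list, so the cost grows with the number of units.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_IMPORTS 4

/* Hypothetical model of a (partial) unit. */
struct sketch_unit {
  int lang;                            /* 0 if DW_AT_language is absent */
  size_t n_imports;                    /* number of imported units */
  size_t imports[SKETCH_MAX_IMPORTS];  /* indices of imported units */
};

/* Return the language of unit TARGET, inheriting it from a unit that
   imports TARGET (directly or indirectly) when TARGET has none.
   Returns 0 if no importer specifies a language.  DEPTH bounds the
   recursion so that import cycles terminate. */
static int
sketch_inherited_lang (const struct sketch_unit *units, size_t n_units,
                       size_t target, size_t depth)
{
  assert (units != NULL && target < n_units);
  if (units[target].lang != 0)
    return units[target].lang;
  if (depth >= n_units)
    return 0;
  for (size_t i = 0; i < n_units; i++)
    for (size_t j = 0; j < units[i].n_imports; j++)
      if (units[i].imports[j] == target)
        {
          int lang = sketch_inherited_lang (units, n_units, i, depth + 1);
          if (lang != 0)
            return lang;
        }
  return 0;
}
```

The initial call passes a depth of 0. The first importer that yields a language wins, which relies on the assumption that all importers of a partial unit agree on the language.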

If people reckon this is a good idea (or, have a better one!), I'm
happy to try and put together a patch.

KJ



Re: dwarf_aggregate_size doesn't work with arrays in partial CUs

2021-10-02 Thread KJ Tsanaktsidis via Elfutils-devel
On Thu, Sep 30, 2021 at 12:27 AM Mark Wielaard wrote:
>
> Hi KJ,
>
> On Sat, 2021-09-25 at 17:21 +1000, KJ Tsanaktsidis via Elfutils-devel
> wrote:
> > I'm writing a program that uses ptrace to poke at internal OpenSSL
> > data structures for another process. I'm using libdw to parse the
> > DWARF data for the copy of OpenSSL actually linked in to the target
> > process, so I can extract struct offsets, member sizes and the like
> > and poke at the right places.
> >
> > I've run into an issue where dwarf_aggregate_size can't calculate the
> > size of an array, when the array is included in a partial CU
> > (DW_TAG_partial_unit). If the array unit includes a DW_AT_upper_bound
> > attribute, but not a DW_AT_lower_bound attribute, then
> > dwarf_aggregate_size will infer the lower bound based on the
> > DW_AT_language attribute of the enclosing CU (i.e. whether the language
> > uses zero- or one-based indexing).
> >
> > However, the debug symbols I'm looking at for OpenSSL from the Ubuntu
> > repositories have the DW_AT_language on the full compilation unit
> > entries, but not in the partial ones included in them. This means that
> > calling dwarf_aggregate_size on the array type DIE does not work.
>
> That is indeed a problem, since dwarf_aggregate_size doesn't provide
> another way to provide the language to use for the
> dwarf_default_lower_bound call. And the default is to return a
> DWARF_E_UNKNOWN_LANGUAGE error.
>
> Maybe we should change the default to assume the lower bound is zero?
>
> > The DWARF spec doesn't really seem to have anything to say on the
> > matter (all it says is "A full or partial compilation unit entry may
> > have the following attributes", but doesn't say what it logically
> > means if an attribute is present on the complete CU but not a partial
> > one).
>
> I think it is assumed that it inherits those attributes from the CU
> from which the partial one was imported and/or from the CU of the DIE
> that referenced the DIE in the partial unit. But I don't think it is
> easy to track that with libdw currently.
>
> > I guess it doesn't really make sense for a single compilation unit to
> > contain multiple languages? So I wonder if dwarf_srclang (called by
> > dwarf_aggregate_size) should crawl through the list of CUs to see if
> > the DIE's CU is included in a CU that _does_ specify DW_AT_language
> > (recursively, I suppose). Then, we can infer that the partial CU's
> > language is the same as the enclosing one.
> >
> > If people reckon this is a good idea (or, have a better one!), I'm
> > happy to try and put together a patch.
>
> I think that suggestion is sound, but really expensive. It also is
> somewhat tricky if you have alt files, you'll have to track back to the
> original Dwarf to see if it imports one of the partial units from the
> alt file.
>
> But I also don't have a good alternative idea. We could maybe have a
> variant of dwarf_aggregate_size that takes a language default value,
> but that doesn't seem like a very generic solution. Or maybe a variant
> of dwarf_srclang that takes any DIE (not just a CU DIE) and which tries
> to figure out the best language to use, which falls back to some
> default value if it cannot figure out what the language is that can be
> used with dwarf_default_lower_bound to get a default (most likely
> zero)?
>
> We could also ask producers (like dwz) to always include a
> DW_AT_language for partial units they create. But that of course makes
> the partial units bigger (and at least dwz creates them to make the
> full debuginfo smaller).
>
> Cheers,
>
> Mark
>

I guess we don't want to hide some really expensive traversal
operation inside a simple call to dwarf_aggregate_size, no...

What if we instead provide a way for the user to specify what language
a CU is? Like "dwarf_cu_report_language(Dwarf_Die *cu, int lang)".
That would get saved with the (partial) CU, and dwarf_srclang could
retrieve this information (if DW_AT_language isn't set). Then, the
user could recursively traverse all CUs and call
dwarf_cu_report_language on each partial CU. And as a bonus, we could
even wrap that up in dwarf_cu_traverse_partial_cu_set_language or
something (OK, the name needs a bit of workshopping).

That way, the expensive thing is in a separate call that's marked as
being very expensive (and cached, so it only needs to be done once).
Sound like a reasonable approach?
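
To illustrate the lookup order I have in mind, here is a small standalone model. None of this exists in libdw: `struct sketch_cu`, `sketch_cu_report_language` and `sketch_srclang` are hypothetical stand-ins for the proposed behaviour, with a language value of 0 meaning "not set" and -1 used as the error return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CU state. */
struct sketch_cu {
  int attr_lang;      /* value of DW_AT_language, or 0 if absent */
  int reported_lang;  /* language supplied by the user, or 0 if none */
};

/* Record the language the user says this (partial) CU has. */
static void
sketch_cu_report_language (struct sketch_cu *cu, int lang)
{
  assert (cu != NULL);
  cu->reported_lang = lang;
}

/* Prefer the CU's own DW_AT_language, then fall back to the reported
   value.  Returns -1 when neither is available. */
static int
sketch_srclang (const struct sketch_cu *cu)
{
  assert (cu != NULL);
  if (cu->attr_lang != 0)
    return cu->attr_lang;
  if (cu->reported_lang != 0)
    return cu->reported_lang;
  return -1;
}
```

A reported language never overrides a DW_AT_language that is actually present; it only fills the gap for units that lack one.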