[Dwarf-Discuss] qualifier modifier type tags vs type signatures

2014-09-25 Thread Mark Wielaard
Hi,

This came up on the gcc list when extending the number of DWARF type
qualifier modifiers that are handled. But the issue can be shown with
just const and volatile.

The issue is that there is no ordering constraint on the type qualifier
modifier tags (they can appear in any order), but the type signature
computation for type units depends on an flattened ordered DIE tree to
get the same signature for the same type from different compile units.

This seems to only cause a missed opportunity of optimizing when using
type units (the type cannot be merged). But maybe there are more subtle
issues if the same type from different compilation units create
different type unit signatures?

Take for example these two compile units, which both define the same
type struct s, but also have some different types:

1)

int a;
const int b;
volatile const c;

struct s
{
  const volatile int i;
} s;

Here GCC would create a const type that points to an int and a volatile
type that points to that const type. When constructing the DWARF
representation of the struct s, it sees it already has a type DIE for
const volatile int and reuses that. So in this case the DIE chain for
the s.i type comes out as:
DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type

2)

int a;
volatile int b;
volatile const c;

struct s
{
  const volatile int i;
} s;

Here GCC would create a volatile type that points to an int and a const
type that points to that volatile type. When constructing the DWARF
representation of the struct s, it again sees it already has a type DIE
for const volatile int and reuses that. So in this case the DIE chain
for the s.i type comes out as:
DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type

The result is that the flattened description of struct s as used by the
type signature computation is different in these two compile units. And
so the MD5 hash used as signature will differ.

I think what GCC does when constructing the DIE type trees is correct.
It creates the most efficient type tree in both compile units. To create
type trees that look the same in flattened form in both compile units it
would have to add extra type trees (a volatile -> int in one or a const
-> int in the other) that aren't used otherwise. And it cannot really
know which ordering is preferred across all compilation units up front.

But it is somewhat unfortunate that it causes the type units to come out
with different signatures meaning they cannot be merged.

What would be the best way to solve this? I see a couple of options:

0) Don't change anything. It is just a missed optimization.

1) DWARF could prescribe an ordering to use when multiple qualifiers
type tags are in used. This might cause producers to create some extra
type trees if different subsets of type qualifiers are used, but if type
units are used, it might lead to a couple more types to share the same
signature.

2) The Type Signature Computation could be changed to understand that
type qualifier tags pointing to each other need to be sorted first. This
makes the algorithm a little trickier, but means less different type
trees in the compile units.

3) DWARF(v5) could deprecate nested type qualifier modifiers as separate
tags and replace them with one DW_TAG_qualified_type tag with either
separate DW_AT_const|volatile|restrict|atomic flag attributes or a
DW_AT_qualifiers attribute that indicates the combined qualifiers
(const, volatile, restrict, atomic). That removes the whole ordering
issues, so it doesn't matter in which order the DIE chain is flattened.

Ideas and/or opinions?

Thanks,

Mark
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


Re: [Dwarf-Discuss] qualifier modifier type tags vs type signatures

2014-09-25 Thread Michael Eager

On 09/25/14 08:18, Mark Wielaard wrote:

Hi,

This came up on the gcc list when extending the number of DWARF type
qualifier modifiers that are handled. But the issue can be shown with
just const and volatile.

The issue is that there is no ordering constraint on the type qualifier
modifier tags (they can appear in any order), but the type signature
computation for type units depends on an flattened ordered DIE tree to
get the same signature for the same type from different compile units.


The C/C++ languages do not place any ordering restrictions on these
qualifiers (or on a number of other qualifiers).

A DWARF producer is free to generate DWARF in any fashion which
accurately describes the source and compilation process.  If you want
to adopt a 'const' before 'volatile' convention (alphabetical) you
are welcome to do so, but there is no requirement to do this.


This seems to only cause a missed opportunity of optimizing when using
type units (the type cannot be merged). But maybe there are more subtle
issues if the same type from different compilation units create
different type unit signatures?

Take for example these two compile units, which both define the same
type struct s, but also have some different types:

1)

int a;
const int b;
volatile const c;

struct s
{
   const volatile int i;
} s;

Here GCC would create a const type that points to an int and a volatile
type that points to that const type. When constructing the DWARF
representation of the struct s, it sees it already has a type DIE for
const volatile int and reuses that. So in this case the DIE chain for
the s.i type comes out as:
DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type

2)

int a;
volatile int b;
volatile const c;

struct s
{
   const volatile int i;
} s;

Here GCC would create a volatile type that points to an int and a const
type that points to that volatile type. When constructing the DWARF
representation of the struct s, it again sees it already has a type DIE
for const volatile int and reuses that. So in this case the DIE chain
for the s.i type comes out as:
DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type

The result is that the flattened description of struct s as used by the
type signature computation is different in these two compile units. And
so the MD5 hash used as signature will differ.

I think what GCC does when constructing the DIE type trees is correct.
It creates the most efficient type tree in both compile units. To create
type trees that look the same in flattened form in both compile units it
would have to add extra type trees (a volatile -> int in one or a const
-> int in the other) that aren't used otherwise. And it cannot really
know which ordering is preferred across all compilation units up front.


Alternately, GCC could have a set of pre-computed DIE type trees in
preferred order and use a matching equivalent when it generates DWARF.


But it is somewhat unfortunate that it causes the type units to come out
with different signatures meaning they cannot be merged.


This seems to be a Quality of Implementation issue.  If a producer
takes a naive approach to generating DWARF type trees, then the results
(in this case, ability to merge debug data) will be worse than if it
takes a more sophisticated approach.


What would be the best way to solve this? I see a couple of options:

0) Don't change anything. It is just a missed optimization.


Yes, it's QoI.


1) DWARF could prescribe an ordering to use when multiple qualifiers
type tags are in used. This might cause producers to create some extra
type trees if different subsets of type qualifiers are used, but if type
units are used, it might lead to a couple more types to share the same
signature.


DWARF is descriptive, not prescriptive.

Both DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type and
DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type are valid
descriptions.

That these both describe the same type in C/C++ is a language issue.
Conceivably there might be a language which does not have this same type
equivalence.


2) The Type Signature Computation could be changed to understand that
type qualifier tags pointing to each other need to be sorted first. This
makes the algorithm a little trickier, but means less different type
trees in the compile units.


This would mean embedding an understanding of C/C++ type equivalence
into the type signature computation.  DWARF is intended to be language-
neutral, and, in particular, avoid hidden language dependencies.


3) DWARF(v5) could deprecate nested type qualifier modifiers as separate
tags and replace them with one DW_TAG_qualified_type tag with either
separate DW_AT_const|volatile|restrict|atomic flag attributes or a
DW_AT_qualifiers attribute that indicates the combined qualifiers
(const, volatile, restrict, atomic). That removes the whole ordering
issues, so it doesn't matter in which order the DIE chain is flattened.


The deadline for DWARF Vers

Re: [Dwarf-Discuss] qualifier modifier type tags vs type signatures

2014-09-25 Thread John DelSignore
Hi Mark,

IMHO, the compiler should be picking a canonical order for the
qualifiers in the DWARF, just like it does for name mangling:

% cat xxx.cxx
void fpcvi(const volatile int*){}
void fpvci(volatile const int*){}
% g++ -g -c xxx.cxx
% nm -C xxx.o
 T fpcvi(int const volatile*)
000a T fpvci(int const volatile*)
 U __gxx_personality_v0
% nm xxx.o
 T _Z5fpcviPVKi
000a T _Z5fpvciPVKi
 U __gxx_personality_v0
%

The compiler (GCC 4.1.2) canonicalized the qualifier order in the
overload portion of the mangled name to "PVKi", which is "pointer-to
volatile const int". I think the demangler just spit the type out
right-to-left.

Here's the DWARF, with irrelevant lines deleted:

% readelf -wi xxx.o
<1><62>: Abbrev Number: 2 (DW_TAG_subprogram)
 DW_AT_name: fpcvi
 DW_AT_MIPS_linkage_name: _Z5fpcviPVKi
<2><91>: Abbrev Number: 3 (DW_TAG_formal_parameter)
 DW_AT_type: <9a>
<1><9a>: Abbrev Number: 4 (DW_TAG_pointer_type)
 DW_AT_type: 
<1>: Abbrev Number: 5 (DW_TAG_const_type)
 DW_AT_type: 
<1>: Abbrev Number: 6 (DW_TAG_volatile_type)
 DW_AT_type: 
<1>: Abbrev Number: 7 (DW_TAG_base_type)
 DW_AT_name: int
<1>: Abbrev Number: 8 (DW_TAG_subprogram)
 DW_AT_name: fpvci
 DW_AT_MIPS_linkage_name: _Z5fpvciPVKi
<2>: Abbrev Number: 3 (DW_TAG_formal_parameter)
 DW_AT_type: <9a>
%

So, in the DWARF, it change the type to "pointer-to const volatile int",
and used the same type for both parameters. It didn't matter what order
I defined the functions, the compiler did the same thing. I think all
this rearranging is perfectly OK.

Given that, your option "0" along seems like the best choice to me. If
someone really wants to squeeze out every last bit of compression, I
think gcc could be changed to canonicalize the order of const/volatile
in your example, and no one would (should) care about the rearrangement
of the qualifiers.

Cheers, John D.

Mark Wielaard wrote:
> Hi,
>
> This came up on the gcc list when extending the number of DWARF type
> qualifier modifiers that are handled. But the issue can be shown with
> just const and volatile.
>
> The issue is that there is no ordering constraint on the type qualifier
> modifier tags (they can appear in any order), but the type signature
> computation for type units depends on an flattened ordered DIE tree to
> get the same signature for the same type from different compile units.
>
> This seems to only cause a missed opportunity of optimizing when using
> type units (the type cannot be merged). But maybe there are more subtle
> issues if the same type from different compilation units create
> different type unit signatures?
>
> Take for example these two compile units, which both define the same
> type struct s, but also have some different types:
>
> 1)
>
> int a;
> const int b;
> volatile const c;
>
> struct s
> {
>   const volatile int i;
> } s;
>
> Here GCC would create a const type that points to an int and a volatile
> type that points to that const type. When constructing the DWARF
> representation of the struct s, it sees it already has a type DIE for
> const volatile int and reuses that. So in this case the DIE chain for
> the s.i type comes out as:
> DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type
>
> 2)
>
> int a;
> volatile int b;
> volatile const c;
>
> struct s
> {
>   const volatile int i;
> } s;
>
> Here GCC would create a volatile type that points to an int and a const
> type that points to that volatile type. When constructing the DWARF
> representation of the struct s, it again sees it already has a type DIE
> for const volatile int and reuses that. So in this case the DIE chain
> for the s.i type comes out as:
> DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type
>
> The result is that the flattened description of struct s as used by the
> type signature computation is different in these two compile units. And
> so the MD5 hash used as signature will differ.
>
> I think what GCC does when constructing the DIE type trees is correct.
> It creates the most efficient type tree in both compile units. To create
> type trees that look the same in flattened form in both compile units it
> would have to add extra type trees (a volatile -> int in one or a const
> -> int in the other) that aren't used otherwise. And it cannot really
> know which ordering is preferred across all compilation units up front.
>
> But it is somewhat unfortunate that it causes the type units to come out
> with different signatures meaning they cannot be merged.
>
> What would be the best way to solve this? I see a couple of options:
>
> 0) Don't change anything. It is just a missed optimization.
>
> 1) DWARF could prescribe an ordering to use when multiple qualifiers
> type tags are in used. This might cause producers to create some extra
> type trees if different subsets of type qualifie