[Dwarf-Discuss] qualifier modifier type tags vs type signatures
Hi, This came up on the gcc list when extending the number of DWARF type qualifier modifiers that are handled. But the issue can be shown with just const and volatile. The issue is that there is no ordering constraint on the type qualifier modifier tags (they can appear in any order), but the type signature computation for type units depends on an flattened ordered DIE tree to get the same signature for the same type from different compile units. This seems to only cause a missed opportunity of optimizing when using type units (the type cannot be merged). But maybe there are more subtle issues if the same type from different compilation units create different type unit signatures? Take for example these two compile units, which both define the same type struct s, but also have some different types: 1) int a; const int b; volatile const c; struct s { const volatile int i; } s; Here GCC would create a const type that points to an int and a volatile type that points to that const type. When constructing the DWARF representation of the struct s, it sees it already has a type DIE for const volatile int and reuses that. So in this case the DIE chain for the s.i type comes out as: DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type 2) int a; volatile int b; volatile const c; struct s { const volatile int i; } s; Here GCC would create a volatile type that points to an int and a const type that points to that volatile type. When constructing the DWARF representation of the struct s, it again sees it already has a type DIE for const volatile int and reuses that. So in this case the DIE chain for the s.i type comes out as: DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type The result is that the flattened description of struct s as used by the type signature computation is different in these two compile units. And so the MD5 hash used as signature will differ. I think what GCC does when constructing the DIE type trees is correct. It creates the most efficient type tree in both compile units. To create type trees that look the same in flattened form in both compile units it would have to add extra type trees (a volatile -> int in one or a const -> int in the other) that aren't used otherwise. And it cannot really know which ordering is preferred across all compilation units up front. But it is somewhat unfortunate that it causes the type units to come out with different signatures meaning they cannot be merged. What would be the best way to solve this? I see a couple of options: 0) Don't change anything. It is just a missed optimization. 1) DWARF could prescribe an ordering to use when multiple qualifiers type tags are in used. This might cause producers to create some extra type trees if different subsets of type qualifiers are used, but if type units are used, it might lead to a couple more types to share the same signature. 2) The Type Signature Computation could be changed to understand that type qualifier tags pointing to each other need to be sorted first. This makes the algorithm a little trickier, but means less different type trees in the compile units. 3) DWARF(v5) could deprecate nested type qualifier modifiers as separate tags and replace them with one DW_TAG_qualified_type tag with either separate DW_AT_const|volatile|restrict|atomic flag attributes or a DW_AT_qualifiers attribute that indicates the combined qualifiers (const, volatile, restrict, atomic). That removes the whole ordering issues, so it doesn't matter in which order the DIE chain is flattened. Ideas and/or opinions? Thanks, Mark ___ Dwarf-Discuss mailing list Dwarf-Discuss@lists.dwarfstd.org http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
Re: [Dwarf-Discuss] qualifier modifier type tags vs type signatures
On 09/25/14 08:18, Mark Wielaard wrote: Hi, This came up on the gcc list when extending the number of DWARF type qualifier modifiers that are handled. But the issue can be shown with just const and volatile. The issue is that there is no ordering constraint on the type qualifier modifier tags (they can appear in any order), but the type signature computation for type units depends on an flattened ordered DIE tree to get the same signature for the same type from different compile units. The C/C++ languages do not place any ordering restrictions on these qualifiers (or on a number of other qualifiers). A DWARF producer is free to generate DWARF in any fashion which accurately describes the source and compilation process. If you want to adopt a 'const' before 'volatile' convention (alphabetical) you are welcome to do so, but there is no requirement to do this. This seems to only cause a missed opportunity of optimizing when using type units (the type cannot be merged). But maybe there are more subtle issues if the same type from different compilation units create different type unit signatures? Take for example these two compile units, which both define the same type struct s, but also have some different types: 1) int a; const int b; volatile const c; struct s { const volatile int i; } s; Here GCC would create a const type that points to an int and a volatile type that points to that const type. When constructing the DWARF representation of the struct s, it sees it already has a type DIE for const volatile int and reuses that. So in this case the DIE chain for the s.i type comes out as: DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type 2) int a; volatile int b; volatile const c; struct s { const volatile int i; } s; Here GCC would create a volatile type that points to an int and a const type that points to that volatile type. When constructing the DWARF representation of the struct s, it again sees it already has a type DIE for const volatile int and reuses that. So in this case the DIE chain for the s.i type comes out as: DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type The result is that the flattened description of struct s as used by the type signature computation is different in these two compile units. And so the MD5 hash used as signature will differ. I think what GCC does when constructing the DIE type trees is correct. It creates the most efficient type tree in both compile units. To create type trees that look the same in flattened form in both compile units it would have to add extra type trees (a volatile -> int in one or a const -> int in the other) that aren't used otherwise. And it cannot really know which ordering is preferred across all compilation units up front. Alternately, GCC could have a set of pre-computed DIE type trees in preferred order and use a matching equivalent when it generates DWARF. But it is somewhat unfortunate that it causes the type units to come out with different signatures meaning they cannot be merged. This seems to be a Quality of Implementation issue. If a producer takes a naive approach to generating DWARF type trees, then the results (in this case, ability to merge debug data) will be worse than if it takes a more sophisticated approach. What would be the best way to solve this? I see a couple of options: 0) Don't change anything. It is just a missed optimization. Yes, it's QoI. 1) DWARF could prescribe an ordering to use when multiple qualifiers type tags are in used. This might cause producers to create some extra type trees if different subsets of type qualifiers are used, but if type units are used, it might lead to a couple more types to share the same signature. DWARF is descriptive, not prescriptive. Both DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type and DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type are valid descriptions. That these both describe the same type in C/C++ is a language issue. Conceivably there might be a language which does not have this same type equivalence. 2) The Type Signature Computation could be changed to understand that type qualifier tags pointing to each other need to be sorted first. This makes the algorithm a little trickier, but means less different type trees in the compile units. This would mean embedding an understanding of C/C++ type equivalence into the type signature computation. DWARF is intended to be language- neutral, and, in particular, avoid hidden language dependencies. 3) DWARF(v5) could deprecate nested type qualifier modifiers as separate tags and replace them with one DW_TAG_qualified_type tag with either separate DW_AT_const|volatile|restrict|atomic flag attributes or a DW_AT_qualifiers attribute that indicates the combined qualifiers (const, volatile, restrict, atomic). That removes the whole ordering issues, so it doesn't matter in which order the DIE chain is flattened. The deadline for DWARF Vers
Re: [Dwarf-Discuss] qualifier modifier type tags vs type signatures
Hi Mark, IMHO, the compiler should be picking a canonical order for the qualifiers in the DWARF, just like it does for name mangling: % cat xxx.cxx void fpcvi(const volatile int*){} void fpvci(volatile const int*){} % g++ -g -c xxx.cxx % nm -C xxx.o T fpcvi(int const volatile*) 000a T fpvci(int const volatile*) U __gxx_personality_v0 % nm xxx.o T _Z5fpcviPVKi 000a T _Z5fpvciPVKi U __gxx_personality_v0 % The compiler (GCC 4.1.2) canonicalized the qualifier order in the overload portion of the mangled name to "PVKi", which is "pointer-to volatile const int". I think the demangler just spit the type out right-to-left. Here's the DWARF, with irrelevant lines deleted: % readelf -wi xxx.o <1><62>: Abbrev Number: 2 (DW_TAG_subprogram) DW_AT_name: fpcvi DW_AT_MIPS_linkage_name: _Z5fpcviPVKi <2><91>: Abbrev Number: 3 (DW_TAG_formal_parameter) DW_AT_type: <9a> <1><9a>: Abbrev Number: 4 (DW_TAG_pointer_type) DW_AT_type: <1>: Abbrev Number: 5 (DW_TAG_const_type) DW_AT_type: <1>: Abbrev Number: 6 (DW_TAG_volatile_type) DW_AT_type: <1>: Abbrev Number: 7 (DW_TAG_base_type) DW_AT_name: int <1>: Abbrev Number: 8 (DW_TAG_subprogram) DW_AT_name: fpvci DW_AT_MIPS_linkage_name: _Z5fpvciPVKi <2>: Abbrev Number: 3 (DW_TAG_formal_parameter) DW_AT_type: <9a> % So, in the DWARF, it change the type to "pointer-to const volatile int", and used the same type for both parameters. It didn't matter what order I defined the functions, the compiler did the same thing. I think all this rearranging is perfectly OK. Given that, your option "0" along seems like the best choice to me. If someone really wants to squeeze out every last bit of compression, I think gcc could be changed to canonicalize the order of const/volatile in your example, and no one would (should) care about the rearrangement of the qualifiers. Cheers, John D. Mark Wielaard wrote: > Hi, > > This came up on the gcc list when extending the number of DWARF type > qualifier modifiers that are handled. But the issue can be shown with > just const and volatile. > > The issue is that there is no ordering constraint on the type qualifier > modifier tags (they can appear in any order), but the type signature > computation for type units depends on an flattened ordered DIE tree to > get the same signature for the same type from different compile units. > > This seems to only cause a missed opportunity of optimizing when using > type units (the type cannot be merged). But maybe there are more subtle > issues if the same type from different compilation units create > different type unit signatures? > > Take for example these two compile units, which both define the same > type struct s, but also have some different types: > > 1) > > int a; > const int b; > volatile const c; > > struct s > { > const volatile int i; > } s; > > Here GCC would create a const type that points to an int and a volatile > type that points to that const type. When constructing the DWARF > representation of the struct s, it sees it already has a type DIE for > const volatile int and reuses that. So in this case the DIE chain for > the s.i type comes out as: > DW_TAG_volatile_type -> DW_TAG_const_type -> DW_TAG_base_type > > 2) > > int a; > volatile int b; > volatile const c; > > struct s > { > const volatile int i; > } s; > > Here GCC would create a volatile type that points to an int and a const > type that points to that volatile type. When constructing the DWARF > representation of the struct s, it again sees it already has a type DIE > for const volatile int and reuses that. So in this case the DIE chain > for the s.i type comes out as: > DW_TAG_const_type -> DW_TAG_volatile_type -> DW_TAG_base_type > > The result is that the flattened description of struct s as used by the > type signature computation is different in these two compile units. And > so the MD5 hash used as signature will differ. > > I think what GCC does when constructing the DIE type trees is correct. > It creates the most efficient type tree in both compile units. To create > type trees that look the same in flattened form in both compile units it > would have to add extra type trees (a volatile -> int in one or a const > -> int in the other) that aren't used otherwise. And it cannot really > know which ordering is preferred across all compilation units up front. > > But it is somewhat unfortunate that it causes the type units to come out > with different signatures meaning they cannot be merged. > > What would be the best way to solve this? I see a couple of options: > > 0) Don't change anything. It is just a missed optimization. > > 1) DWARF could prescribe an ordering to use when multiple qualifiers > type tags are in used. This might cause producers to create some extra > type trees if different subsets of type qualifie