Hey Todd, just some details on the string reduction strategies I'm pursuing to address DWARF32 overflow of the .debug_str.dwo/.debug_str_offsets.dwo sections in some large binaries at Google.
The extreme cases I'm dealing with are predominantly C++ expression templates (in TensorFlow and Eigen). These produce types with very large DW_AT_names (e.g. "f1<int>") and DW_AT_linkage_names (e.g. "_Z2f1IiEvv"), but with many more template parameters, none of which are user-written; they are all deduced.

So the main fix I'm pursuing (roughly called "simplified template names") is to omit template parameter lists from the DW_AT_name of templates in most cases, allowing the consumer to reconstruct the name itself, recursively, from the DW_TAG_template_*_parameter DIEs. Further discussion and details here: https://groups.google.com/g/llvm-dev/c/ekLMllbLIZg/m/-dhJ0hO1AAAJ

In terms of how this affects scaling factors, it means that adding another template instantiation over existing types adds no new data to .debug_str (e.g. going from a program with "t1<int>" to one with "t1<t1<int>>" would add no new entries to .debug_str).

Not all names can be readily reconstructed, so I'm opting those out of the feature, but we could have a deeper discussion about how to handle them if we wanted to make this a full-fledged/robust feature (maybe one the DWARF spec suggests/encourages). GDB seems to handle this sort of debug info OK. I guess someone did real work to support that at some point (so maybe some other compiler already generates DWARF like this).

The other half, though, is DW_AT_linkage_names. In theory similar rebuilding could be done, but that would require baking into the DWARF consumer a lot of the implementation knowledge that DWARF is meant to help avoid, so I'm unsure what the right solution is there just now, but there are a few ideas I'm still kicking around. At least linkage names have less redundancy than DW_AT_names: within a single name they avoid repetition ("t1<t1<int>, t1<int>>" ends up with a single description of "t1<int>" rather than the two you get in the DW_AT_name), so they already scale a bit better.
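
As a rough illustration of the name reconstruction and the .debug_str scaling point above, here is a toy model in Python. It is only a sketch: the Die class, full_name, and string_table are names made up for this example, a DIE is reduced to a simplified name plus its template parameter types, and the "string table" is just a set of distinct names. It covers type parameters only and ignores the cases noted above that can't readily be reconstructed.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class Die:
    """Hypothetical stand-in for a type DIE (not a real DWARF library type)."""
    name: str  # DW_AT_name with the template parameter list omitted
    params: List["Die"] = field(default_factory=list)  # template type parameters


def full_name(die: Die) -> str:
    """Rebuild the full name by recursing through the template parameters."""
    if not die.params:
        return die.name
    return die.name + "<" + ", ".join(full_name(p) for p in die.params) + ">"


def string_table(dies: Iterable[Die], name_of: Callable[[Die], str]) -> Set[str]:
    """Collect the distinct name strings needed to describe the given DIEs."""
    out: Set[str] = set()

    def visit(die: Die) -> None:
        out.add(name_of(die))
        for p in die.params:
            visit(p)

    for d in dies:
        visit(d)
    return out


int_ = Die("int")
t1_int = Die("t1", [int_])       # t1<int>
t1_t1_int = Die("t1", [t1_int])  # t1<t1<int>>

# Full names: every new instantiation adds a new, longer string.
before_full = string_table([t1_int], full_name)
after_full = string_table([t1_int, t1_t1_int], full_name)

# Simplified names: the new instantiation reuses "t1" and "int".
before_simple = string_table([t1_int], lambda d: d.name)
after_simple = string_table([t1_int, t1_t1_int], lambda d: d.name)
```

In this model the full-name table grows by one string when t1<t1<int>> is added, while the simplified-name table is unchanged, and full_name still recovers "t1<t1<int>>" for display.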
Happy to discuss any of these ideas specifically, or their impact on .debug_str growth in more detail, any time (here, video chat, Discord, etc.). - Dave
_______________________________________________ Dwarf-Discuss mailing list Dwarf-Discuss@lists.dwarfstd.org http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org