+Pavel Labath <lab...@google.com> - since he's hit some issues related to this in lldb.
Oh, yeah, I thought there was a case where we mangled the function parameter into the mangled name of a lambda - but I might've misremembered. The global variable case seems the closest to that and appears previously in this thread - scoping types within variable DIEs seems weird enough that I'm not convinced it's a great direction to go... Unnamed things are... unnameable, so their scope seems sort of unimportant to a degree - and maybe just providing an identifier (the mangled name of the type) to allow a consumer to match them up would be good.

On Tue, Feb 28, 2023 at 4:07 PM David Blaikie <dblai...@gmail.com> wrote:
> Hmm - I guess one complication of only putting the mangling number on
> the type, is that you need the scope of the lambda too... which is
> tricky in this case:
>
> extern int i;
> int i = []{ return 3; }();
>
> In this case, the lambda is mangled in the scope of the global
> variable `i`: i::{lambda()#1}::operator()() const
> (https://godbolt.org/z/15Eqa8ajT)
>
> Oh, and I guess you can use a lambda without ever instantiating its
> operator(), and for a generic lambda there's nothing to describe...
>
> eg:
> template<typename T>
> void f1(const T&){}
> inline void f2() {
>   f1([](auto){});
> }
> void f3() {
>   f2();
> }
>
> Clang's DWARF for the anonymous type is:
> 0x00000043:   DW_TAG_class_type
>                 DW_AT_calling_convention (DW_CC_pass_by_value)
>                 DW_AT_byte_size (0x01)
>                 DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/test.cpp")
>                 DW_AT_decl_line (4)
>
> GCC's includes a dtor (called "~<lambda>") but the type just has size,
> file, line, and column.
>
> So we could avoid using the whole mangled name of the anonymous type
> in some cases - maybe it's worth having features (like being able to
> provide the mangling number in an attribute, maybe being able to scope
> the type inside a variable DIE?
> though that sounds a bit frightening)
> to help in those cases, even if in some of the worst cases we'd have
> to use the mangled name to reassociate anonymous types?
>
> - Dave
>
> On Mon, Aug 22, 2022 at 12:44 PM David Blaikie <dblai...@gmail.com> wrote:
> >
> > Ping - any thoughts here?
> >
> > On Sun, Jul 24, 2022 at 9:08 PM David Blaikie <dblai...@gmail.com> wrote:
> > >
> > > Ping on this thread - would love to hear what ideas folks have for
> > > addressing the naming of anonymous types (enums, structs/classes, and
> > > lambdas) - especially if it'd make it easier to go back/forth between
> > > the DW_AT_name of a template with an unnamed type as a parameter and
> > > the actual DIEs describing the same parameter type.
> > >
> > > On Tue, Jun 14, 2022 at 1:02 PM David Blaikie <dblai...@gmail.com> wrote:
> > > >
> > > > Looks like https://reviews.llvm.org/D122766 (-ffile-reproducible) might solve my immediate issues in clang, but I think we should still consider moving to a more canonical naming of lambdas that, necessarily, doesn't include the file name (unfortunately). Probably has to include the lambda numbering/something roughly equivalent to the mangled lambda name - it could include type information (it'd be superfluous to a unique identifier, but I don't think it would break consistently naming the same type across CUs either).
> > > >
> > > > Anyone got ideas/preferences/thoughts on this?
> > > >
> > > > On Mon, Jan 24, 2022 at 5:51 PM David Blaikie <dblai...@gmail.com> wrote:
> > > >>
> > > >> On Mon, Jan 24, 2022 at 5:37 PM Adrian Prantl <apra...@apple.com> wrote:
> > > >>>
> > > >>> On Jan 23, 2022, at 2:53 PM, David Blaikie <dblai...@gmail.com> wrote:
> > > >>>
> > > >>> A rather common "quality of implementation" issue seems to be lambda naming.
> > > >>>
> > > >>> I came across this due to non-canonicalization of lambda names in template parameters depending on how a source file is named in Clang, and GCC's names seem to be very ambiguous:
> > > >>>
> > > >>> $ cat tmp/lambda.h
> > > >>> template<typename T>
> > > >>> void f1(T) { }
> > > >>> static int i = (f1([]{}), 1);
> > > >>> static int j = (f1([]{}), 2);
> > > >>> void f1() {
> > > >>>   f1([]{});
> > > >>>   f1([]{});
> > > >>> }
> > > >>> $ cat tmp/lambda.cpp
> > > >>> #ifdef I_PATH
> > > >>> #include <tmp/lambda.h>
> > > >>> #else
> > > >>> #include "lambda.h"
> > > >>> #endif
> > > >>> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot lambda.o | grep "f1<"
> > > >>>                 DW_AT_name ("f1<(lambda at ./tmp/lambda.h:3:20)>")
> > > >>>                 DW_AT_name ("f1<(lambda at ./tmp/lambda.h:4:20)>")
> > > >>>                 DW_AT_name ("f1<(lambda at ./tmp/lambda.h:6:6)>")
> > > >>>                 DW_AT_name ("f1<(lambda at ./tmp/lambda.h:7:6)>")
> > > >>> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep "f1<"
> > > >>>                 DW_AT_name ("f1<(lambda at tmp/lambda.h:3:20)>")
> > > >>>                 DW_AT_name ("f1<(lambda at tmp/lambda.h:4:20)>")
> > > >>>                 DW_AT_name ("f1<(lambda at tmp/lambda.h:6:6)>")
> > > >>>                 DW_AT_name ("f1<(lambda at tmp/lambda.h:7:6)>")
> > > >>> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep "f1<"
> > > >>>                 DW_AT_name ("f1<f1()::<lambda()> >")
> > > >>>                 DW_AT_name ("f1<f1()::<lambda()> >")
> > > >>>                 DW_AT_name ("f1<<lambda()> >")
> > > >>>                 DW_AT_name ("f1<<lambda()> >")
> > > >>>
> > > >>> (I came across this in the context of my simplified template names work - rebuilding names from the DW_TAG description of the template parameters - and while I'm not rebuilding names that have lambda parameters (keep encoding the full string instead).
> > > >>> The issue is if some other type depends on a type with a lambda parameter - but then multiple uses of that inner type exist, from different translation units (using type units) with different ways of naming the same file - so then the expected name has one spelling, but the actual spelling is different due to the "./")
> > > >>>
> > > >>> But all this said - it'd be good to figure out a reliable naming - the names we have here, while usable for humans (pointing to source files, etc), don't reliably give unique names for each lambda/template instantiation, which would make it difficult for a consumer to know if two entities are the same (important for types - is some function parameter the same type as another type?)
> > > >>>
> > > >>> While it's expected that cross-producer (eg: trying to be compatible with GCC and Clang debug info) you have to do some fuzzy matching (eg: "f1<int*>" or "f1<int *>" at the most basic - there are more complicated cases) - this one's not possible with the data available.
> > > >>>
> > > >>> The source file/line/column is insufficient to uniquely identify a lambda (multiple lambdas stamped out by a macro would all get the same file/line/col) and valid code (albeit unlikely) that writes the same definition in multiple places could make the same lambda have different names.
> > > >>>
> > > >>> We should probably use something more like what the various ABI manglings do to identify these entities.
> > > >>>
> > > >>> But we should probably also do this for other unnamed types that have linkage (need to/would benefit from being matched up between two CUs), even if they're not lambdas.
> > > >>>
> > > >>> FWIW, at least the llvm-cxxfilt demanglings of clang's manglings for these symbols are:
> > > >>>
> > > >>> void f1<$_0>($_0)
> > > >>> f1<$_1>($_1)
> > > >>> void f1<f1()::$_2>(f1()::$_2)
> > > >>> void f1<f1()::$_3>(f1()::$_3)
> > > >>>
> > > >>> Should we use that instead?
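For anyone skimming the thread, the macro point (source location can't identify a lambda) can be sketched as below. This is only an illustration: `Located`, `located`, `TWO_LAMBDAS` and `same_line_distinct_types` are made-up names, and the sketch only checks `__LINE__` - that the DWARF decl column also collides is the claim from the thread, not something this code verifies.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper for illustration: pairs a callable with the source line
// it was written on.
template <typename F>
struct Located {
  int line;
  F fn;
};

template <typename F>
Located<F> located(int line, F fn) { return {line, fn}; }

// One macro expansion stamps out two lambdas; __LINE__ is the invocation line
// for both of them.
#define TWO_LAMBDAS(a, b)                        \
  auto a = located(__LINE__, [] { return 1; }); \
  auto b = located(__LINE__, [] { return 2; })

// Returns true when both lambdas report the same line yet have distinct types.
inline bool same_line_distinct_types() {
  TWO_LAMBDAS(first, second);
  constexpr bool same_type =
      std::is_same<decltype(first.fn), decltype(second.fn)>::value;
  return first.line == second.line && !same_type &&
         first.fn() == 1 && second.fn() == 2;
}
```

So a name built only from file and line would conflate `first.fn` and `second.fn`, even though the compiler treats them as two different closure types.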
> > > >>>
> > > >>> The only other information that the current human-readable DWARF name carries is the file+line and that is fully redundant with DW_AT_decl_file/DW_AT_decl_line, so the above scheme seems reasonable to me. Poorly symbolicated backtraces would be worse in this scheme, so I'm expecting most pushback from users who rely on a tool that just prints the human readable name with no source info.
> > > >>
> > > >> Yeah - you can always pull the file/line/col from the DW_AT_decl_* anyway, so encoding it in the type name does seem redundant and inefficient indeed (beyond/independent of the correctness issues).
> > > >>>
> > > >>> GCC's mangling's different (in these examples that's OK, since they're all internal linkage):
> > > >>>
> > > >>> void f1<f1()::'lambda0'()>(f1()::'lambda0'())
> > > >>> void f1<f1()::'lambda'()>(f1()::'lambda'())
> > > >>>
> > > >>> If I add an example like this:
> > > >>>
> > > >>> inline auto f2() { return []{}; }
> > > >>>
> > > >>> and instantiate the template with the result of f2:
> > > >>>
> > > >>> void f1<f2()::'lambda'()>(f2()::'lambda'())
> > > >>>
> > > >>> GCC:
> > > >>>
> > > >>> void f1<f2()::'lambda'()>(f2()::'lambda'())
> > > >>>
> > > >>> So they consistently use the same mangling - we could use the same naming for template parameters?
> > > >>>
> > > >>> How should we communicate this sort of identity for unnamed types in the DIEs describing the types themselves (not just the string of a template name of a type instantiated with the unnamed type) so the unnamed type can be matched up between translation units?
> > > >>>
> > > >>> eg, if I have these two translation units:
> > > >>> // header
> > > >>> inline auto f1() { struct { } local; return local; }
> > > >>> // unit 1:
> > > >>> #include "header"
> > > >>> auto f2(decltype(f1())) { }
> > > >>> // unit 2:
> > > >>> #include "header"
> > > >>> decltype(f1()) v1;
> > > >>>
> > > >>> Currently the DWARF produced for this unnamed type is:
> > > >>> 0x0000003f:   DW_TAG_structure_type
> > > >>>                 DW_AT_calling_convention (DW_CC_pass_by_value)
> > > >>>                 DW_AT_byte_size (0x01)
> > > >>>                 DW_AT_decl_file ("/usr/local/google/home/blaikie/dev/scratch/test.cpp")
> > > >>>                 DW_AT_decl_line (1)
> > > >>>
> > > >>> Is this the type of struct {}?
> > > >>
> > > >> Yep. You'll get separate distinct descriptions that are essentially the same - imagine if `f1` had two such types written as "struct {}" (say they were used to instantiate two different templates - "struct {} a; struct {} b; f_templ(a); f_templ(b);") - the DWARF will have two of those unnamed DW_TAG_structure_types and two template specializations, etc - but no way to know which of those unnamed types line up with uses in another translation unit, in terms of overload resolution, etc.
> > > >>>
> > > >>> So if you see that structure type definition in two different translation units, there's no way to know whether they refer to the same type, because there may be multiple types that have the same DWARF description. (So no way to know if the DWARF consumer should allow the user to evaluate an expression `f2(v1)` or not, I think?)
> > > >>>
> > > >>> Does a C++ compiler usually treat structurally equivalent but differently named types as interchangeable?
> > > >>
> > > >> No - given "struct A { int i; }; struct B { int i; }; void f1(A); ... " - "f1(A())" is valid, but "f1(B())" is invalid and an error at compile-time.
> > > >> https://godbolt.org/z/de7Yce1qW
> > > >>
> > > >>>
> > > >>> Does a C++ compiler usually treat structurally equivalent anonymous types as interchangeable?
> > > >>
> > > >> No, same rules apply as named types: https://godbolt.org/z/hxWMYbWc8
> > > >>
> > > >>>
> > > >>> -- adrian
> > > >>>
> > > >>> I guess the only way to have an unnamed type with linkage is to use it inside an inline function - so within that scope you'd have to produce DWARF for any types consistently in all definitions of the function and then a consumer could match them up by counting (assuming the unnamed types were always emitted in the same order in the child DIE list)...
> > > >>>
> > > >>> But this all seems a bit subtle & maybe would benefit from a more robust/explicit description?
> > > >>>
> > > >>> Perhaps adding an integer attribute to number anonymous types? They'd need to differentiate between lambdas and other anonymous types, since they have separate numberings.
> > > >>>
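In case the godbolt links rot, here's a small self-contained sketch of the language rule the quoted thread relies on: an unnamed type that escapes an inline function is one specific entity, and a textually identical unnamed type elsewhere is a different one. `f1` and `f2` mirror the header example from the thread (with `f2` returning bool so it can be checked); `g1` and `unnamed_type_round_trips` are names made up for this illustration. It assumes C++14 or later for the deduced return types.

```cpp
#include <cassert>
#include <type_traits>

// Mirrors the thread's header: an unnamed local type escapes an inline function.
inline auto f1() { struct { } local; return local; }

// Hypothetical second function with a textually identical unnamed type.
inline auto g1() { struct { } local; return local; }

// Accepts only the unnamed type from f1, like f2 in the thread.
inline bool f2(decltype(f1())) { return true; }

// Every mention of decltype(f1()) names the same type...
static_assert(std::is_same<decltype(f1()), decltype(f1())>::value,
              "same entity");
// ...but the identical-looking type from g1 is a different type, so f2(g1())
// would be rejected at compile time.
static_assert(!std::is_same<decltype(f1()), decltype(g1())>::value,
              "distinct entities");

// Analogous to evaluating f2(v1) across "unit 1" and "unit 2".
inline bool unnamed_type_round_trips() {
  decltype(f1()) v1{};
  return f2(v1);
}
```

A debugger evaluating `f2(v1)` needs the same answer the compiler gives here, which is why the DIEs for the two `struct { }` types need something beyond size/file/line to tell them apart.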
--
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss