Re: [Dwarf-Discuss] CU-local types

2022-06-14 Thread David Blaikie via Dwarf-Discuss
On Wed, May 18, 2022 at 9:53 AM David Blaikie  wrote:
>
> On Wed, May 18, 2022 at 4:16 AM Robinson, Paul  wrote:
> >
> > > Looks like gdb and lldb both have issues with C++ local types (either
> > > types defined in anonymous namespaces, or otherwise localized - eg: a
> > > non-local template with a local type or variable in one of its
> > > parameters).
> > > ...
> > > So... what could/should we do about this?
> >
> > Do you have a strong argument for why these are not debugger bugs?
> > It sounds to me like gdb/lldb are handling anonymous namespaces
> > incorrectly, in effect treating their contents as global rather than
> > CU-local.
>
> Oh, right, sorry forgot to include the trickier examples.
>
> So for a non-template this isn't especially burdensome (check for an
> anonymous namespace in the parent scopes - it's language specific, but
> not a ton of weird work to do) - for templates it's a bit harder (you
> have to check every template parameter, and potentially arbitrarily
> deep - eg: you might have a template parameter that's a function type
> and one of the parameters is a pointer type and the type the pointer
> points to is local - thus the template is local. That seems a bit more
> of a stretch to ask the consumer to do totally reliably) - but the
> worst case, that at the moment there's potentially no way to
> disambiguate whether the type is local or not: A non-type template
> parameter that points to a local variable.
>
> static int x = 3;
> template <int *> struct t1 { };
> t1<&x> v;
>
> Currently both LLVM and GCC name this type "t1<&x>" and LLVM at least
> puts a DW_AT_location on the DW_TAG_template_value_parameter which
> points to the global variable (not the DW_TAG_variable, but to the
> actual ELF symbol in the file) - though this choice has some negative
> effects (causes the symbol to be "used" and linked in - which means
> that enabling debug info can affect the behavior of the program
> (global ctors in that file execute when they wouldn't've otherwise,
> etc)).
>
> If the location is provided, arguably the consumer could lookup the
> symbol and check its linkage (not always accurate - LTO might've
> internalized a variable that wasn't actually internal in the original
> source, for instance) - but when the location is not provided there's
> no way to know whether "t1<&x>" is local or not.
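
The consumer-side check described above (anonymous namespace in a parent scope, then every template parameter, arbitrarily deep) could be sketched roughly as follows. This is only an illustration under stated assumptions: `Type`, its fields, and `is_cu_local` are hypothetical names for a simplified in-memory model, not the API of gdb, lldb, or any DWARF library. The `value_param_is_internal` flag stands for exactly the piece of information that, per the message above, DWARF may not provide today.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified model of a type as a consumer might build it from
// DWARF DIEs. None of these names come from a real debugger or library.
struct Type {
    // True if some enclosing DW_TAG_namespace has no name.
    bool in_anonymous_namespace = false;
    // Types this one depends on: pointee, function parameter/return types,
    // template type parameters, and so on.
    std::vector<const Type*> referenced;
    // One entry per non-type template parameter that names a variable: true
    // if that variable is known to be internal (e.g. `static int x`).
    // Assumption: the consumer can somehow learn this.
    std::vector<bool> value_param_is_internal;
};

// Recursive helper; `seen` guards against cycles in the type graph.
static bool is_cu_local_impl(const Type& t, std::unordered_set<const Type*>& seen) {
    if (!seen.insert(&t).second) return false;  // already visited
    if (t.in_anonymous_namespace) return true;
    for (bool internal : t.value_param_is_internal)
        if (internal) return true;
    for (const Type* r : t.referenced)
        if (r != nullptr && is_cu_local_impl(*r, seen)) return true;
    return false;
}

// A type is treated as CU-local if it, or anything it depends on, is local.
bool is_cu_local(const Type& t) {
    std::unordered_set<const Type*> seen;
    return is_cu_local_impl(t, seen);
}
```

For the `t1<&x>` case, the walk only gives the right answer if the flag for `&x` can be filled in, which is the gap this thread is about.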

Ping - anyone got further ideas about how to address this issue/encode
this information?
___
Dwarf-Discuss mailing list
Dwarf-Discuss@lists.dwarfstd.org
http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org


Re: [Dwarf-Discuss] debug_aranges use and overhead

2022-06-14 Thread David Blaikie via Dwarf-Discuss
Given the discussion previously in this thread - does anyone have
particular objections to removing .debug_aranges? (in favor of/perhaps
with specific wording that /requires/ CU level ranges to be specified
(ie: it's not acceptable to have a subprogram with non-empty range in
a CU which doesn't cover that range) - so a consumer can look at the
CU and, if it has no ranges, conclude that it has no addresses covered
and skip the CU for address-related computation purposes)
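
A rough sketch of the consumer behavior this wording would permit: if CU-level ranges are required to be complete, an address lookup can skip any CU with no ranges without looking at its children. `AddressRange`, `CompileUnit`, and `find_cu` are hypothetical names for a simplified model, not taken from any existing consumer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [low, high).
struct AddressRange {
    std::uint64_t low;
    std::uint64_t high;
};

// Hypothetical model of a CU: only the ranges taken from its
// DW_AT_low_pc/DW_AT_high_pc or DW_AT_ranges attributes.
struct CompileUnit {
    int id;
    std::vector<AddressRange> ranges;
};

// Returns the CU whose ranges cover `addr`, or nullptr if none does.
const CompileUnit* find_cu(const std::vector<CompileUnit>& cus, std::uint64_t addr) {
    for (const CompileUnit& cu : cus) {
        // Under the proposed rule, a CU with no ranges covers no addresses,
        // so its subprograms never need to be parsed for this lookup.
        if (cu.ranges.empty()) continue;
        for (const AddressRange& r : cu.ranges)
            if (addr >= r.low && addr < r.high) return &cu;
    }
    return nullptr;
}
```

A linear scan is used here only for brevity; a real consumer would likely build a sorted table from the same CU-level ranges.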


Re: [Dwarf-Discuss] debug_aranges use and overhead

2022-06-14 Thread Samy Al Bahra via Dwarf-Discuss
At this point, performance is good enough for our use-case, no qualms from
me.




-- 
Samy Al Bahra [http://repnop.org]


Re: [Dwarf-Discuss] CU-local types

2022-06-14 Thread Greg Clayton via Dwarf-Discuss
Template types are emitted for C++ in DWARF as specialized instances only;
there is no generic definition of the type. One of the issues that impedes LLDB
from functioning correctly in the expression parser for C++ with templates is
how the accelerator table entries are emitted. If you have a
"std::vector<int>", the only accelerator table entry, if there is one at all,
contains this full name ("std::vector<int>"). LLDB uses clang as the expression
parser, so if someone types this into their expression, they end up with one
entry that points this full name to the matching entry. LLDB acts as a
precompiled header for clang when evaluating expressions, so first we will try
to find "std" in the accelerator tables and we will find the namespace, then
clang will ask for the name "vector" to be found within the "std" namespace and
we will never find it, since the name of a class is always fully specified. With
functions we end up with the base name of the function as the DW_AT_name and we
have the DW_AT_linkage_name for the mangled name, both of which will appear in
the accelerator tables. But for classes we don't have the base name of the
class as the DW_AT_name, so we will never find this template class unless we
again ignore all accelerator tables and generate them ourselves each time we
debug. Granted, this can be fixed in LLDB at the great cost of having to parse
every DIE in all units each time we start debugging so we can make an index
that works for these lookups, but that cost is prohibitive.
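
To make the lookup mismatch concrete, here is a small sketch of the kind of base-name extraction an index would need so that a query for "vector" can match an entry stored under its full template name. `template_base_name` is a hypothetical helper written for this illustration; it is not how LLDB or any producer does it, and it does not try to handle every corner case (for example operator names combined with template arguments).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Strips one trailing template argument list, e.g. "vector<int>" -> "vector".
// Names that do not end in a balanced <...> list are returned unchanged.
std::string template_base_name(std::string_view name) {
    if (name.empty() || name.back() != '>') return std::string(name);
    int depth = 0;
    // Scan backwards, matching each '>' with its '<'.
    for (std::size_t i = name.size(); i-- > 0;) {
        if (name[i] == '>') {
            ++depth;
        } else if (name[i] == '<' && --depth == 0) {
            return std::string(name.substr(0, i));
        }
    }
    return std::string(name);  // unbalanced, e.g. "operator>"
}
```

An index keyed on both the full name and this base name would let the "find vector inside std" query succeed, at the cost of having to disambiguate between several specializations that share one base name.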



Re: [Dwarf-Discuss] debug_aranges use and overhead

2022-06-14 Thread Greg Clayton via Dwarf-Discuss
As long as there is a DW_AT_ranges on the CU that is complete, that is good
enough for LLDB. No one seems to consistently emit .debug_aranges these days so
we definitely don't rely on it.

Greg



Re: [Dwarf-Discuss] CU-local types

2022-06-14 Thread David Blaikie via Dwarf-Discuss
On Tue, Jun 14, 2022 at 2:01 PM Greg Clayton  wrote:
>
> Template types are emitted for C++ in DWARF as specialized instances only;
> there is no generic definition of the type. One of the issues that impedes
> LLDB from functioning correctly in the expression parser for C++ with
> templates is how the accelerator table entries are emitted. If you have a
> "std::vector<int>", the only accelerator table entry, if there is one at all,
> contains this full name ("std::vector<int>"). LLDB uses clang as the
> expression parser, so if someone types this into their expression, they end up
> with one entry that points this full name to the matching entry. LLDB acts
> as a precompiled header for clang when evaluating expressions, so first we
> will try to find "std" in the accelerator tables and we will find the
> namespace, then clang will ask for the name "vector" to be found within the
> "std" namespace and we will never find it, since the name of a class is always
> fully specified. With functions we end up with the base name of the function
> as the DW_AT_name and we have the DW_AT_linkage_name for the mangled name,
> both of which will appear in the accelerator tables. But for classes we don't
> have the base name of the class as the DW_AT_name, so we will never find this
> template class unless we again ignore all accelerator tables and generate
> them ourselves each time we debug. Granted, this can be fixed in LLDB at the
> great cost of having to parse every DIE in all units each time we start
> debugging so we can make an index that works for these lookups, but that cost
> is prohibitive.

This is certainly an issue (though might be a bit different from what
you've described - at least at a quick glance it looks like for
"ns::t<int>" we get separate entries for "ns" and for the unqualified
"t<int>" (but not for "t") in the accelerator tables) - and with
simplified template names we may get "t" in the accelerator table
rather than "t<int>" (& so then you'll get "t" entries for "t<int>"
and "t<float>", etc... and have to disambiguate them)

But that's, I think, a different topic from the one this thread is
about - how to identify which types are CU-local and which types are
not. Maybe the template accelerated access issue would be worth
another/separate thread.


Re: [Dwarf-Discuss] lambda (& other anonymous type) identification/naming

2022-06-14 Thread David Blaikie via Dwarf-Discuss
Looks like https://reviews.llvm.org/D122766 (-ffile-reproducible) might
solve my immediate issues in clang, but I think we should still consider
moving to a more canonical naming of lambdas that, necessarily, doesn't
include the file name (unfortunately). Probably has to include the lambda
numbering/something roughly equivalent to the mangled lambda name - it
could include type information (it'd be superfluous to a unique identifier,
but I don't think it would break consistently naming the same type across
CUs either).

Anyone got ideas/preferences/thoughts on this?

On Mon, Jan 24, 2022 at 5:51 PM David Blaikie  wrote:

> On Mon, Jan 24, 2022 at 5:37 PM Adrian Prantl  wrote:
>
>>
>>
>> On Jan 23, 2022, at 2:53 PM, David Blaikie  wrote:
>>
>> A rather common "quality of implementation" issue seems to be lambda
>> naming.
>>
>> I came across this due to non-canonicalization of lambda names in
>> template parameters depending on how a source file is named in Clang, and
>> GCC's seem to be very ambiguous:
>>
>> $ cat tmp/lambda.h
>> template <typename T>
>> void f1(T) { }
>> static int i = (f1([]{}), 1);
>> static int j = (f1([]{}), 2);
>> void f1() {
>>   f1([]{});
>>   f1([]{});
>> }
>> $ cat tmp/lambda.cpp
>> #ifdef I_PATH
>> #include <tmp/lambda.h>
>> #else
>> #include "lambda.h"
>> #endif
>> $ clang++-tot tmp/lambda.cpp -g -c -I. -DI_PATH && llvm-dwarfdump-tot
>> lambda.o | grep "f1<"
>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:3:20)>")
>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:4:20)>")
>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:6:6)>")
>> DW_AT_name  ("f1<(lambda at ./tmp/lambda.h:7:6)>")
>> $ clang++-tot tmp/lambda.cpp -g -c && llvm-dwarfdump-tot lambda.o | grep
>> "f1<"
>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:3:20)>")
>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:4:20)>")
>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:6:6)>")
>> DW_AT_name  ("f1<(lambda at tmp/lambda.h:7:6)>")
>> $ g++-tot tmp/lambda.cpp -g -c -I. && llvm-dwarfdump-tot lambda.o | grep
>> "f1<"
>> DW_AT_name  ("f1<f1()::<lambda()> >")
>> DW_AT_name  ("f1<f1()::<lambda()> >")
>> DW_AT_name  ("f1<<lambda()> >")
>> DW_AT_name  ("f1<<lambda()> >")
>>
>> (I came across this in the context of my simplified template names work -
>> rebuilding names from the DW_TAG description of the template parameters -
>> and I'm not rebuilding names that have lambda parameters (those keep
>> encoding the full string instead). The issue arises when some other type
>> depends on a type with a lambda parameter, and multiple uses of
>> that inner type exist, from different translation units (using type units)
>> with different ways of naming the same file - so then the expected name has
>> one spelling, but the actual spelling is different due to the "./")
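
As an aside on the "./" mismatch: a consumer could paper over this particular spelling difference by lexically normalizing the path before comparing names. The sketch below assumes C++17 and uses a hypothetical helper, `normalize_decl_path`; it only addresses the path spelling and does nothing for the uniqueness problem discussed in the rest of this message.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Normalizes a path purely lexically (no filesystem access), so that
// "./tmp/lambda.h" and "tmp/lambda.h" compare equal afterwards.
std::string normalize_decl_path(const std::string& p) {
    return std::filesystem::path(p).lexically_normal().generic_string();
}
```

This is a workaround on the consumer side, not a fix: two producers (or two invocations of one producer) can still disagree in ways normalization cannot undo, such as absolute versus relative paths.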
>>
>> But all this said - it'd be good to figure out a reliable naming - the
>> names we have here, while usable for humans (pointing to source files, etc),
>> don't reliably give unique names for each lambda/template
>> instantiation, which would make it difficult for a consumer to know if two
>> entities are the same (important for types - is some function parameter the
>> same type as another type?)
>>
>> While it's expected cross-producer (eg: trying to be compatible with GCC
>> and Clang debug info) you have to do some fuzzy matching (eg: "f1" or
>> "f1" at the most basic - there are more complicated cases) - this
>> one's not possible with the data available.
>>
>> The source file/line/column is insufficient to uniquely identify a lambda
>> (multiple lambdas stamped out by a macro would get all the same
>> file/line/col) and valid code (albeit unlikely) that writes the same
>> definition in multiple places could make the same lambda have different
>> names.
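
The macro case can be illustrated with a few lines of C++17. The macro and variable names below are made up for this example. Both lambdas are written by a single macro invocation, so a name derived from the source position where the macro is used cannot tell them apart, yet they are distinct types.

```cpp
#include <type_traits>

// Hypothetical macro that stamps out two lambdas from one invocation.
#define MAKE_TWO_LAMBDAS(a, b) \
    auto a = [] {};            \
    auto b = [] {};

// Both closure objects originate from this one source line.
MAKE_TWO_LAMBDAS(lambda_a, lambda_b)

// Each lambda expression has its own unique closure type.
static_assert(!std::is_same_v<decltype(lambda_a), decltype(lambda_b)>,
              "two lambda expressions never share a closure type");
```

A mangling-style discriminator (a per-scope lambda number) distinguishes these two, which is the direction suggested in the following paragraphs.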
>>
>> We should probably use something more like the way various ABI manglings
>> do to identify these entities.
>>
>> But we should probably also do this for other unnamed types that have
>> linkage (need to/would benefit from being matched up between two CUs), even
>> not lambdas.
>>
>> FWIW, at least the llvm-cxxfilt demanglings of clang's manglings for
>> these symbols is:
>>
>>  void f1<$_0>($_0)
>>  void f1<$_1>($_1)
>>  void f1<f1()::$_2>(f1()::$_2)
>>  void f1<f1()::$_3>(f1()::$_3)
>>
>> Should we use that instead?
>>
>>
>> The only other information that the current human-readable DWARF name
>> carries is the file+line and that is fully redundant with DW_AT_file/line,
>> so the above scheme seems reasonable to me. Poorly symbolicated backtraces
>> would be worse in this scheme, so I'm expecting most pushback from users
>> who rely on a tool that just prints the human readable name with no source
>> info.
>>
>
> Yeah - you can always pull the file/line/col from the DW_AT_decl_* anyway,
> so encoding it in the type name does seem redundant and inefficient ind