Re: [lldb-dev] Resolving dynamic type based on RTTI fails in case of type names inequality in DWARF and mangled symbols

2017-12-18 Thread xgsa via lldb-dev
Thank you for the clarification, Jim. You are right, I misunderstood a little 
what lldb actually does.

It is not that the compiler can't be fixed; the point is that relying on the 
correspondence between mangled and demangled forms is not reliable enough, so 
we are looking for more robust alternatives. Moreover, I am not sure that such 
fuzzy matching could be done based on the class name alone, so it would require 
reading more DIEs. Given that, for instance, our project has quite a lot of 
such types, it could noticeably slow down the debugger.

Thus, I'd like to mention one more alternative and get your feedback, if 
possible. What is actually needed is the correspondence between the mangled and 
demangled vtable symbols. Would it be worth emitting a separate section during 
compilation (like, e.g., apple_types) that stores this correspondence? It 
would be fast and more reliable than the current approach, but it would 
certainly increase the debug info size (I cannot estimate by how much, e.g. 
in percent).

What do you think? Which solution is preferable?

Thanks,
Anton.

15.12.2017, 23:34, "Jim Ingham" :
> First off, just a technical point. lldb doesn't use RTTI to find dynamic 
> types, and in fact works for projects like lldb & clang that turn off RTTI. 
> It just uses the fact that the vtable symbol for an object demangles to:
>
> vtable for CLASSNAME
>
> That's not terribly important, but I just wanted to make sure people didn't 
> think lldb was doing something fancy with RTTI... Note, gdb does (or at least 
> used to do) dynamic detection the same way.
>
> If the compiler can't be fixed, then it seems like your solution [2] is what 
> we'll have to try.
>
> As it works now, we get the CLASSNAME from the vtable symbol and look it up 
> in the list of types. That is pretty quick because the type names are 
> indexed, so we can find it with a quick search in the index. Changing this 
> over to a method where we do some additional string matching rather than just 
> using the table's hashing is going to be a fair bit slower because you have 
> to run over EVERY type name. But this might not be that bad. You would first 
> look it up by exact CLASSNAME and only fall back on your fuzzy match if this 
> fails, so most dynamic type lookups won't see any slowdown. And if you know 
> the cases where you get into this problem you can probably further restrict 
> when you need to do this work so you don't suffer this penalty for every 
> lookup where we don't have debug info for the dynamic type. And you could 
> keep a side-table of mangled-name -> DWARF name, and maybe a black-list for 
> unfound names, so you only have to do this once.
>
> This estimation is based on the assumption that you can do your work just on 
> the type names, without having to get more type information out of the DWARF 
> for each candidate match. A solution that relies on realizing every class in 
> lldb so you can get more information out of the type information to help with 
> the match will defeat all our attempts at lazy DWARF reading. This can cause 
> quite long delays in big programs. So I would be much more worried about a 
> solution that requires this kind of work. Again, if you can reject most 
> potential candidates by looking at the name, and only have to realize a few 
> likely types, the approach might not be that slow.
>
> Jim
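
The lookup strategy Jim describes (exact lookup first, fuzzy scan only on a miss, plus a side-table and a black-list so the scan runs once per name) could be sketched roughly as follows. This is only an illustration, not lldb's code: DynamicTypeResolver, same_template_name and fuzzy_scans are hypothetical names, the set stands in for lldb's hashed type-name index, the matcher is a deliberately naive placeholder, and the side-table is keyed by the demangled class name rather than the mangled name purely for brevity.

```python
from typing import Callable, Dict, Iterable, Optional, Set

VTABLE_PREFIX = "vtable for "


def same_template_name(wanted: str, candidate: str) -> bool:
    """Toy fuzzy matcher (an assumption): compare only the text before '<'."""
    return wanted.split("<", 1)[0] == candidate.split("<", 1)[0]


class DynamicTypeResolver:
    """Illustrative two-stage lookup; not lldb's actual implementation."""

    def __init__(self, type_names: Iterable[str],
                 fuzzy_match: Callable[[str, str], bool]) -> None:
        self._all_names = list(type_names)             # scanned only on the slow path
        self._index: Set[str] = set(self._all_names)   # stand-in for the hashed index
        self._fuzzy_match = fuzzy_match
        self._resolved: Dict[str, str] = {}            # side-table of earlier fuzzy hits
        self._not_found: Set[str] = set()              # black-list of known misses
        self.fuzzy_scans = 0                           # how often the slow path ran

    def resolve(self, demangled_vtable_symbol: str) -> Optional[str]:
        if not demangled_vtable_symbol.startswith(VTABLE_PREFIX):
            return None
        class_name = demangled_vtable_symbol[len(VTABLE_PREFIX):]

        # Fast path: exact hit in the index, no extra cost.
        if class_name in self._index:
            return class_name
        # Cached outcomes of earlier slow-path lookups.
        if class_name in self._resolved:
            return self._resolved[class_name]
        if class_name in self._not_found:
            return None

        # Slow path: run the matcher over every known type name, once per name.
        self.fuzzy_scans += 1
        for candidate in self._all_names:
            if self._fuzzy_match(class_name, candidate):
                self._resolved[class_name] = candidate
                return candidate
        self._not_found.add(class_name)
        return None
```

With the names "I" and "Impl<TagType::Tag1>" in the index, resolving "vtable for I" never touches the slow path, while "vtable for Impl<(TagType)0>" triggers a single scan and is answered from the side-table afterwards. A real matcher would have to be far stricter than same_template_name, which is exactly where the cost concern about reading more type information comes in.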
>
>>  On Dec 15, 2017, at 7:11 AM, xgsa via lldb-dev  
>> wrote:
>>
>>  Sorry, I probably shouldn't have used HTML for that message. Converted to 
>> plain text.
>>
>>   Original message 
>>  15.12.2017, 18:01, "xgsa" :
>>
>>  Hi,
>>
>>  I am working on an issue where, in C++ programs, showing the dynamic type 
>> based on RTTI in lldb doesn't work properly for some complex cases with 
>> templates. Consider the following example:
>>  enum class TagType : bool
>>  {
>>      Tag1
>>  };
>>
>>  struct I
>>  {
>>      virtual ~I() = default;
>>  };
>>
>>  template <TagType>
>>  struct Impl : public I
>>  {
>>  private:
>>      int v = 123;
>>  };
>>
>>  int main(int argc, const char * argv[]) {
>>      Impl<TagType::Tag1> impl;
>>      I& i = impl;
>>      return 0;
>>  }
>>
>>  For this example clang generates the type name "Impl<TagType::Tag1>" in 
>> DWARF and "__ZTS4ImplIL7TagType0EE" when mangling symbols (which lldb 
>> demangles to Impl<(TagType)0>). Thus, when lldb tries to resolve the type in 
>> ItaniumABILanguageRuntime::GetTypeInfoFromVTableAddress(), it is unable to 
>> find it. More cases and a detailed description of why lldb fails here can be 
>> found in this clang review, which tries to fix this in clang [1].
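
To make the mismatch concrete, here is a small sketch of one possible "semantically aware" normalization: rewriting enumerator template arguments in a DWARF-style name into the cast form the demangler prints. It assumes the DWARF name for the example is spelled Impl<TagType::Tag1>; ENUMS and to_demangled_style are hypothetical names, and the regular expression only handles unqualified enum names, so this is an illustration rather than a proposed implementation.

```python
import re
from typing import Dict

# Hypothetical enum table: enum name -> {enumerator name -> value}.
# In a debugger this information would have to come from the enum's DIEs.
ENUMS: Dict[str, Dict[str, int]] = {"TagType": {"Tag1": 0}}

# Matches "Name::Name"; qualified enums such as ns::E::V are not handled.
_ENUMERATOR = re.compile(r"\b([A-Za-z_]\w*)::([A-Za-z_]\w*)\b")


def to_demangled_style(dwarf_name: str, enums: Dict[str, Dict[str, int]]) -> str:
    """Rewrite known enumerators as "(Enum)value"; leave everything else alone."""

    def replace(match: "re.Match[str]") -> str:
        enum_name, enumerator = match.group(1), match.group(2)
        values = enums.get(enum_name)
        if values is None or enumerator not in values:
            return match.group(0)  # not a known enumerator, keep the text
        return f"({enum_name}){values[enumerator]}"

    return _ENUMERATOR.sub(replace, dwarf_name)
```

Applied to "Impl<TagType::Tag1>" with the table above, the function yields "Impl<(TagType)0>", which matches the demangled form quoted in the message. Note that the enum table is additional type information beyond the class name, so a matcher along these lines needs more than the name index alone.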
>>
>>  However, during the discussion around this review [2], it was pointed out 
>> that DWARF names are expected to stay close to the sources, which clang does 
>> perfectly, whereas the mangling algorithm is strictly defined. Thus, matching 
>> them for equality can sometimes fail. The idea suggested in [2] was to 
>> implement more semantically aware matching. There is enough

Re: [lldb-dev] Resolving dynamic type based on RTTI fails in case of type names inequality in DWARF and mangled symbols

2017-12-18 Thread Tamas Berghammer via lldb-dev
Hi Anton and Jim,

What do you think about storing the mangled type name or the mangled vtable
symbol name somewhere in DWARF in the DW_AT_MIPS_linkage_name attribute? We
are already doing it for the mangled names of functions so extending it to
types shouldn't be too controversial.

Tamas

Re: [lldb-dev] Resolving dynamic type based on RTTI fails in case of type names inequality in DWARF and mangled symbols

2017-12-18 Thread Robinson, Paul via lldb-dev
The linkage-name attribute was really intended for definitions of objects that 
have static memory addresses (static/global variables, and functions), but 
adding it to a class description would have an obvious meaning and seems 
completely in line with how DWARF works.
Given the size of mangled names, you probably want to do this only for 
definitions of classes that have vtables.  With that caveat, I'd have no 
problem doing this.
--paulr
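
A rough sketch of what the proposal could look like from the debugger side, assuming the attribute stores the mangled vtable symbol name and is emitted only for classes that have vtables. ClassDie, build_linkage_index and find_dynamic_type are hypothetical names, and the dataclass is a drastically simplified stand-in for a DWARF class DIE, not an lldb or LLVM type.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ClassDie:
    """Hypothetical, simplified stand-in for a DWARF class DIE."""
    name: str                     # DW_AT_name as written by the compiler
    has_vtable: bool
    linkage_name: Optional[str]   # proposed: mangled vtable symbol, if has_vtable


def build_linkage_index(dies: Iterable[ClassDie]) -> Dict[str, ClassDie]:
    """Map mangled vtable symbol -> class DIE, skipping classes without vtables."""
    return {d.linkage_name: d for d in dies if d.has_vtable and d.linkage_name}


def find_dynamic_type(index: Dict[str, ClassDie],
                      vtable_symbol: str) -> Optional[ClassDie]:
    """Exact lookup by mangled name; no demangling or name matching involved."""
    return index.get(vtable_symbol)
```

The lookup itself becomes an exact match on the mangled symbol, so the DWARF spelling of the type name no longer matters. The sketch builds the dictionary by walking every class DIE up front; whether that cost can be avoided depends on how the names would be indexed, which the sketch does not address.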

Re: [lldb-dev] Resolving dynamic type based on RTTI fails in case of type names inequality in DWARF and mangled symbols

2017-12-18 Thread xgsa via lldb-dev
Hi Tamas,

First, why DW_AT_MIPS_linkage_name rather than just DW_AT_linkage_name? The 
latter is standardized and is currently generated by clang, at least on x64.

Second, this doesn't help to solve the issue, because it would require parsing 
all the DWARF types during startup to build a map, which breaks the lazy DWARF 
loading performed by lldb. Or am I missing something?

Thanks,
Anton.

Re: [lldb-dev] lldb test suite on macOS 10.13 (High Sierra)

2017-12-18 Thread Adrian Prantl via lldb-dev
I also just hit this and apparently this is an intentional behavior of xcrun.

Note that this only affects systems that have the so-called command line tools 
installed (the package you get when you install the tools on their own, without 
installing Xcode).

When the command line tools are installed *and* xcrun is run without explicitly 
asking for an sdk, it will add /usr/local/include to the search path instead of 
adding the -isysroot /Applications/Xcode.app/.../MacOSX10.13.sdk that we want 
here. This explains why Pavel's workaround works.

I'm not yet sure whether requiring the macosx SDK in this file is always the 
right thing to do here or if there is a better solution.

-- adrian
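
For illustration, the workaround of always asking xcrun for the macosx SDK could be expressed as a small helper that prepares the environment for the compiler invocation. This is only a sketch under stated assumptions: sdk_env is a hypothetical name, it is not the code in the test suite's configuration file, and it treats an empty SDKROOT the same as an unset one because the empty value is what made xcrun fall back to "/" in the log quoted below.

```python
import os
from typing import Dict, Mapping, Optional


def sdk_env(base_env: Optional[Mapping[str, str]] = None,
            default_sdk: str = "macosx") -> Dict[str, str]:
    """Return a copy of the environment with SDKROOT set to a usable value.

    An unset or empty SDKROOT is replaced by default_sdk; an explicit,
    non-empty value chosen by the user is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    if not env.get("SDKROOT"):
        env["SDKROOT"] = default_sdk
    return env
```

The returned dictionary could then be passed as the env argument of subprocess.run when invoking xcrun. Leaving a non-empty SDKROOT alone is a design choice of the sketch, so that someone who deliberately points at another SDK is not overridden.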

> On Dec 8, 2017, at 2:02 PM, Pavel Labath via lldb-dev 
>  wrote:
> 
> In test/testcases/configuration.py:47, change "SDKROOT=" to "SDKROOT=macosx".
> 
> I did not put this out for review (yet) mainly because I was not sure
> whether this affects only our buildbot (which was updated from some
> ancient version straight to 10.13, so it could be something introduced
> by the upgrade).
> 
> However, if you are running into this as well (and I understand you
> have a fresh macbook :D), then I think we should just really fix it.
> 
> 
> On 8 December 2017 at 19:39, Davide Italiano  wrote:
>> Pavel, I happened to hit this.
>> I'm not sure how you worked around, as I tried to export
>> SDKROOT=macosx but that didn't work for me.
>> Do you have a patch or a series of commands I can run?
>> 
>> Thanks,
>> 
>> --
>> Davide
>> 
>> On Thu, Nov 16, 2017 at 4:25 AM, Pavel Labath via lldb-dev
>>  wrote:
>>> Right, after learning way more than I ever wanted to know about xcrun,
>>> I think I see the issue. When running with empty SDKROOT variable,
>>> xcrun sets SDKROOT to "/" when running clang:
>>> $ SDKROOT= xcrun --no-cache --log --verbose clang
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
>>> -o 
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
>>> -framework Python -Xlinker -dylib -v
>>> xcrun: note: PATH = '/usr/bin:/bin'
>>> xcrun: note: SDKROOT = '/'
>>> ^ WRONG. there are no
>>> python headers there
>>> 
>>> .
>>> 
>>> env SDKROOT=/ 
>>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
>>> -o 
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
>>> -framework Python -Xlinker -dylib -v
>>> 
>>> ...
>>> 
>>> #include "..." search starts here:
>>> #include <...> search starts here:
>>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/9.0.0/include
>>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
>>> /usr/include
>>> /System/Library/Frameworks (framework directory)
>>> /Library/Frameworks (framework directory)
>>>  clang gets the include path wrong
>>> 
>>> End of search list.
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c:15:10:
>>> fatal error: 'Python/Python.h' file not found
>>> #include <Python/Python.h>
>>> ^
>>> 1 error generated.
>>> ^ and errors out
>>> 
>>> 
>>> 
>>> On the other hand, if I invoke xcrun with SDKROOT=macosx, everything
>>> works just fine:
>>> 
>>> $ SDKROOT=macosx xcrun --no-cache --log --verbose clang
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.c
>>> -o 
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo.so
>>> -framework Python -Xlinker -dylib -v
>>> xcrun: note: looking up SDK with
>>> '/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk
>>> macosx -version Path'
>>> xcrun: note: PATH = '/usr/bin:/bin'
>>> xcrun: note: SDKROOT = 'macosx'
>>> xcrun: note: TOOLCHAINS = ''
>>> xcrun: note: DEVELOPER_DIR = '/Applications/Xcode.app/Contents/Developer'
>>> xcrun: note: XCODE_DEVELOPER_USR_PATH = ''
>>> xcrun: note: xcrun_db =
>>> '/var/folders/bt/tws6gynx0ws1cc4ss53_pvqmgq/T/xcrun_db'
>>> xcrun: note: lookup resolved to:
>>> '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk'
>>> xcrun: note: PATH = '/usr/bin:/bin'
>>> xcrun: note: SDKROOT =
>>> '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk'
>>>^^ correct. a valid SDK is located there
>>> 
>>> ...
>>> 
>>> env 
>>> SDKROOT=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk
>>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
>>> /Users/lldb_build/lldbSlave/buildDir/llvm/tools/lldb/packages/Python/lldbsuite/test/crashinfo