zturner added a comment.

In https://reviews.llvm.org/D52461#1280527, @aleksandr.urakov wrote:

> Update the diff according to the discussion, making it possible to parse MSVC 
> demangled names by `CPlusPlusLanguage`. The old PDB plugin still uses 
> `MSVCUndecoratedNameParser` directly because:
>
> - we are sure that the name in PDB is an MSVC name;
> - it has a more convenient interface, especially for restoring namespaces 
> from the parsed name.


So I came up with an interesting solution to this while working on the native 
pdb plugin.  It cannot be used with the old pdb plugin, but given that it 
works flawlessly for the native pdb plugin, depending on how urgent your need 
is, maybe you can just put off working on this until you're ready to move over 
to the native pdb plugin?

Basically the idea is that the raw PDB contains mangled type names for every 
type.  You can see this by dumping types using `llvm-pdbutil`, as follows (I 
just picked a random one from my build directory).

  D:\src\llvmbuild\ninja-x64>bin\llvm-pdbutil.exe dump -types bin\sancov.pdb | grep -A 2 LF_STRUCT | more
      0x1001 | LF_STRUCTURE [size = 88] ``anonymous-namespace'::RawCoverage`
               unique name: `.?AURawCoverage@?A0xa74cdb40@@`
               vtable: <no type>, base list: <no type>, field list: <no type>
  --
      0x100A | LF_STRUCTURE [size = 212] `std::default_delete<std::set<unsigned __int64,std::less<unsigned __int64>,std::allocator<unsigned __int64> > >`
               unique name: `.?AU?$default_delete@V?$set@_KU?$less@_K@std@@V?$allocator@_K@2@@std@@@std@@`
               vtable: <no type>, base list: <no type>, field list: <no type>
  --
      0x102B | LF_STRUCTURE [size = 88] ``anonymous-namespace'::FileHeader`
               unique name: `.?AUFileHeader@?A0xa74cdb40@@`
               vtable: <no type>, base list: <no type>, field list: <no type>
  --
      0x1031 | LF_STRUCTURE [size = 112] `std::default_delete<llvm::MemoryBuffer>`
               unique name: `.?AU?$default_delete@VMemoryBuffer@llvm@@@std@@`
               vtable: <no type>, base list: <no type>, field list: <no type>
  --
      0x1081 | LF_STRUCTURE [size = 304] `llvm::AlignedCharArrayUnion<std::unique_ptr<llvm::MemoryBuffer,std::default_delete<llvm::MemoryBuffer> >,char,char,char,char,char,char,char,char,char>`
               unique name: `.?AU?$AlignedCharArrayUnion@V?$unique_ptr@VMemoryBuffer@llvm@@U?$default_delete@VMemoryBuffer@llvm@@@std@@@std@@DDDDDDDDD@llvm@@`
               vtable: <no type>, base list: <no type>, field list: <no type>
  --
      0x1082 | LF_STRUCTURE [size = 176] `llvm::AlignedCharArrayUnion<std::error_code,char,char,char,char,char,char,char,char,char>`
               unique name: `.?AU?$AlignedCharArrayUnion@Verror_code@std@@DDDDDDDDD@llvm@@`
               vtable: <no type>, base list: <no type>, field list: <no type>

So the interesting thing here is this "unique name" field.  It is not 
accessible via the DIA SDK, but it gives us complete, rich information about 
the type that is otherwise impossible to get.  We don't even have to guess, 
because we can just demangle the name.  And coincidentally, I recently 
finished writing a Microsoft ABI demangler which is now in LLVM.  :)   This 
`.?AU` syntax is non-standard, but it was easy for me to figure out, and I 
hacked up our demangle library to support this prefix (it's not checked in 
yet).  And basically everything that comes after it exactly matches a mangled 
type.
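
To make the prefix handling concrete, here is a minimal Python sketch (not the 
actual demangle library change, which is not shown here) that peels `.?A` plus 
a tag-kind code off a unique name and returns the remaining mangled type name. 
The function name `split_unique_name` is hypothetical, and the `T`/`V`/`W4` 
codes are an assumption based on the usual Microsoft mangling scheme; only `U` 
(struct) actually appears in the dump above.

```python
# Tag-kind codes. Assumption: only "U" (struct) is confirmed by the dump in
# this thread; the other codes follow the usual Microsoft mangling scheme.
TAG_KINDS = {"W4": "enum", "T": "union", "U": "struct", "V": "class"}


def split_unique_name(unique_name):
    """Split a PDB unique name into (tag kind, remaining mangled type name)."""
    prefix = ".?A"
    if not unique_name.startswith(prefix):
        raise ValueError("not a '.?A' unique name: " + unique_name)
    rest = unique_name[len(prefix):]
    for code, kind in TAG_KINDS.items():
        if rest.startswith(code):
            # Everything after the tag-kind code is the mangled type name.
            return kind, rest[len(code):]
    raise ValueError("unrecognized tag-kind code in: " + unique_name)


print(split_unique_name(".?AURawCoverage@?A0xa74cdb40@@"))
```

The returned remainder is what would be handed to a real Microsoft ABI 
demangler; this sketch does no demangling itself.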

So, just to give an example: instead of teaching `CPlusPlusNameParser` to 
handle ``anonymous namespace'::RawCoverage`, we simply demangle 
`.?AURawCoverage@?A0xa74cdb40@@`, and we get back a vector of 2 strings which 
are ``anonymous namespace'` and `RawCoverage`.  Beyond that, there are many 
other benefits.  Since PDB doesn't contain rich information about template 
parameters, all we could do until now was create an entry in the AST that 
says "there's a type with this enormously long name that contains angle 
brackets and other junk".  But with this technique, we could actually create 
legitimate template decls in the AST the way it's supposed to be.
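
As a rough illustration of the "vector of 2 strings" result, here is a 
minimal Python sketch that handles only the simplest case: a non-template 
struct whose unique name is `.?AU` followed by `@`-terminated scope 
components, innermost first. The function name `scope_components` is 
hypothetical, and this is not how the LLVM demangler is implemented; 
templates (`?$`), back-references, and other tag kinds need a real demangler 
and are rejected here.

```python
def scope_components(unique_name):
    """Return scope components, outermost first, for a simple struct name."""
    prefix = ".?AU"
    if not unique_name.startswith(prefix):
        raise ValueError("expected a '.?AU' struct unique name")
    body = unique_name[len(prefix):]
    if "?$" in body:
        # Template names need a real demangler; this sketch gives up.
        raise NotImplementedError("template names are not handled")
    if not body.endswith("@@"):
        raise ValueError("expected the name to end with '@@'")

    # Each component is terminated by '@'; the list ends with an extra '@'.
    parts = body[:-2].split("@")
    if any(not p or p[0].isdigit() for p in parts):
        raise ValueError("unsupported component in: " + unique_name)

    result = []
    for part in reversed(parts):  # mangled order is innermost first
        if part.startswith("?A"):
            # '?A0x...' encodes an anonymous namespace.
            result.append("`anonymous namespace'")
        else:
            result.append(part)
    return result


print(scope_components(".?AURawCoverage@?A0xa74cdb40@@"))
```

For the `RawCoverage` record from the dump this yields the anonymous 
namespace followed by `RawCoverage`, i.e. the same two pieces described 
above, without parsing the demangled string at all.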

There is obviously a lot of complexity in doing it here, but I think long term 
it will be a richer experience if we parse the mangled name than if we parse 
the demangled name.  But it's only possible with the native plugin.

What do you think?


https://reviews.llvm.org/D52461


