[Lldb-commits] [PATCH] D50071: Use rich mangling information in Symtab::InitNameIndexes()

Stefan Gränitz via Phabricator via lldb-commits Thu, 02 Aug 2018 11:22:45 -0700

sgraenitz added a comment.

> (However if you are interested in something like this, then it might be 
> interesting to look at whether this MethodName stuff couldn't be properly 
> pluginified. Something like where a Language class registers a callback for a 
> specific mangling type, and then accessing that through this)


Yes this sounds more like a plan. An isolated refactoring patch without would 
be appropriate I think. In contrast, making a quick fix just to avoid the 
`llvm::Any` seems less appealing to me.



================
Comment at: include/lldb/Core/RichManglingInfo.h:83-84
+public:
+  RichManglingInfo *SetItaniumInfo();
+  RichManglingInfo *SetLegacyCxxParserInfo(const ConstString &mangled);
+
----------------
labath wrote:
> sgraenitz wrote:
> > labath wrote:
> > > I find it odd that one of these functions is stateless 
> > > (`SetLegacyCxxParserInfo` takes the string which initializes the object 
> > > as an argument), while the other receives the state magically via the IPD 
> > > accessor.
> > > 
> > > Would it be possible to change it so that the other option is stateless 
> > > too (`SetItaniumInfo(ConstString mangled)`)?
> > > 
> > > We could change `RichManglingInfo` to return the full demangled name too, 
> > > so that you have access to the demangled name via this API too (which /I 
> > > think/ is the reason why you created a separate 
> > > `GetItaniumRichDemangleInfo` function).
> > > Would it be possible to change it so that the other option is stateless 
> > > too (`SetItaniumInfo(ConstString mangled)`)?
> > I think that would be misleading for the reader. Symmetry is always nice to 
> > indicate congruent behavior, but this is not the case here. The fallback 
> > C++ parser version mimics the previous behavior: demangle the name in step 
> > 1 and if that succeeds, parse it with `CPlusPlusLanguage::MethodName` to 
> > decode its properties. The IPD does that in a single step and so there is 
> > nothing to do in `SetItaniumInfo()`. `SetLegacyCxxParserInfo()` on the 
> > other hand has to create a `CPlusPlusLanguage::MethodName` from the mangled 
> > string.
> > 
> > > We could change `RichManglingInfo` to return the full demangled name too, 
> > > so that you have access to the demangled name via this API too (which /I 
> > > think/ is the reason why you created a separate 
> > > GetItaniumRichDemangleInfo function).
> > The interface wants to provide access to the name's encoded properties. 
> > That doesn't really include the demangled "human readable" string 
> > representation (it's also not needed in 
> > `Symtab::RegisterMangledNameEntry()`).
> > 
> > Of course `Symtab::InitNameIndexes()` will ask for `GetDemangledName()` 
> > just after processing the mangled one. I think chances are good that for 
> > symbols that are not filtered out anyway, we reached 
> > `DemangleWithRichManglingInfo()` earlier and it will be in the string pool 
> > already (see Mangled.cpp:322). Otherwise, yes it will be demangled in with 
> > a new IPD instance.
> > 
> > I think it's not bad to keep these cases as they are. I would propose to 
> > investigate the performance of the current implementation in more detail 
> > and think about the benefits before mixing them. Not sure at all, if it 
> > would bring measurable gain.
> I don't see how this would be misleading. Both versions provide a way to 
> access the name's properties. The fact that they take slightly different 
> inputs (one takes a mangled name, one demangled) makes them slightly 
> different, but I don't find that too troublesome. We could encode that 
> difference in the method name to make it more obvious. What troubles me more 
> is that you have two-step initialization for the IPD case. It's always nice 
> to avoid that, and that's particularly important when it's asymmetric with 
> the other method.
> 
> Let me try to elaborate on how I'd do this. `RichManglingContext` (ps: maybe 
> it's not correct to call this a *mangling* context, because the second method 
> does not use mangled names. `RichMethodContext` ?) would have two methods:
> `fromItaniumName(ConstString ItaniumMangled)` and 
> `fromCxxMethodName(ConstString Name)`. Both would return `RichManglingInfo` 
> (`RichMethodInfo` ?) as they do now.
> 
> Then the usage would be something like:
> ```
>   case eManglingSchemeItanium:
>     auto *info = context.fromItaniumName(m_mangled);
>     m_demangled.SetCStringWithMangledCounterPart(info->GetFullName(), 
> m_mangled);
>     return info;
> 
>   case eManglingSchemeMSVC:
>     
> m_demangled.SetCStringWithMangledCounterpart(demangle_manually(m_mangled), 
> m_mangled);
>     return context.fromCxxMethodName(m_demangled);
> ```
> The only change necessary for this is to include the full demangled name in 
> the "name's encoded properties". While that may not be a *property* of the 
> name by the strictest definition, I don't think it's much of a stretch to 
> include it there, and it does not require making any compromises, as both 
> methods already have access to this information.
> 
> So overall, I believe this should have no impact on performance, while 
> improving the "symmetry" of the code (which is somewhat subjective, but I am 
> hoping that we can agree that a having single method is better than having 
> two methods which must be called in the right order). Is there anything I'm 
> missing here?
Ok, sounds reasonable. Let's try that. I will prepare a patch.


================
Comment at: source/Core/RichManglingInfo.cpp:36-40
+RichManglingContext::SetLegacyCxxParserInfo(const ConstString &mangled) {
+  m_info.ResetProvider();
+  m_info.m_provider = RichManglingInfo::PluginCxxLanguage;
+  m_info.m_legacy_parser = new CPlusPlusLanguage::MethodName(mangled);
+  return &m_info;
----------------
labath wrote:
> sgraenitz wrote:
> > labath wrote:
> > > Is this really the **mangled** name?
> > Yes, it is.
> How is that possible? I see nothing in CPlusPlusLanguage::MethodName which 
> would perform the demangling, and yet it clearly operates on demangled names. 
> Also, the LHS of this patch creates MethodName instances like this
> `CPlusPlusLanguage::MethodName 
> cxx_method(mangled.GetDemangledName(lldb::eLanguageTypeC_plus_plus));`
> 
> 
Oh right, I should revisit that.

While we are here: It's a pity that only MSVC-mangled names trigger the 
fallback case with the new implementation. Thus, my unit test has no coverage 
for it and I didn't recognize the mistake. I thought about integrating 
ObjCLanguage parsing here too. On the one hand this would improve test 
coverage. On the other hand it would cause a semantic change in name indexing 
(and my ObjC knowledge is quite limited).

What do you think?


https://reviews.llvm.org/D50071



_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D50071: Use rich mangling information in Symtab::InitNameIndexes()

Reply via email to