malaperle added a comment.

In https://reviews.llvm.org/D39050#949185, @malaperle wrote:

> In https://reviews.llvm.org/D39050#948500, @akyrtzi wrote:
>
> > @malaperle, to clarify we are not suggesting that you write your own 
> > parser, the suggestion is to use clang in 'fast-scan' mode to get the 
> > structure of the declarations of a single file, see 
> > `CXTranslationUnit_SingleFileParse` (along with enabling skipping of 
> > bodies). We have found clang is super fast when you only try to get the 
> > structure of a file like this.
>
>
> Thank you, that sounds very useful. I will try that and get some measurements.
>
> > We can make convenient APIs to provide the syntactic structure of 
> > declarations based on their location.
>
> Perhaps just for the end-loc since it's pretty much guaranteed to be needed 
> by everyone. But if it's very straightforward, perhaps that's not needed. 
> I'll try and see.
>
> > But let's say we added the end-loc, is it enough ? If you want to implement 
> > the 'peek the definition' like Eclipse, then it is not enough, you also 
> > need to figure out if there are documentation comments associated with the 
> > declaration and also show those. Also what if you want to highlight the 
> > type signature of a function, then just storing the location of the closing 
> > brace of its body is not enough. There can be any arbitrary things you may 
> > want to get from the structure of the declaration (e.g. the parameter 
> > ranges), but we could provide an API to gather any syntactic structure info 
> > you may want.
>
> That's a very good point. I guess in the back of my mind, I have the worry 
> that one cannot extend what is stored, either for a different performance 
> trade-off or for additional things. The fact that both clang and clangd have 
> to agree on the format so that index-while-building can be used seems to make 
> it inherently not possible to extend. But perhaps it's better to not 
> overthink this for now.


I did a bit more experimenting. For the end-loc, I changed my prototype so 
that the end-loc is not stored in the index but rather computed "on the fly" 
using only SourceManager and Lexer. For my little benchmark, I used the 
LLVM/Clang/Clangd code base and queried it for all references to "std" (the 
namespace), which comes to around 46K references in the index.
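To illustrate what "computed on the fly" involves, here is a minimal 
standalone sketch. It is not the prototype's code and does not use clang: 
`tokenEndOffset` is a hypothetical helper that re-lexes a single token from 
its start offset in an already-loaded buffer. In clang, this job would 
presumably go through SourceManager plus something like 
`Lexer::getLocForEndOfToken`; which calls the prototype actually uses is not 
stated here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not a clang API: given a file buffer and the offset
// where a token starts, re-lex just that token and return the offset one past
// its last character. Only identifier-like tokens and single-character
// punctuation are handled in this sketch.
std::size_t tokenEndOffset(const std::string &Buffer, std::size_t Start) {
  if (Start >= Buffer.size())
    return Buffer.size();

  auto IsIdentChar = [](char C) {
    unsigned char U = static_cast<unsigned char>(C);
    return std::isalnum(U) != 0 || C == '_';
  };

  // Identifier or number: consume until the first non-identifier character.
  if (IsIdentChar(Buffer[Start])) {
    std::size_t End = Start;
    while (End < Buffer.size() && IsIdentChar(Buffer[End]))
      ++End;
    return End;
  }

  // Anything else is treated as a one-character token.
  return Start + 1;
}
```

The point of the sketch is the cost model: the index only needs to store the 
start offset, but every query result then requires the file buffer to be 
available and a small lexing step per occurrence.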

With end-loc in index: 3.45s on average    (20 samples)
With end-loc computed on the fly: 11.33s on average    (20 samples)

I also tried with Xcode, without much success: it took about 30 seconds to 
reach 45K results, then carried on for a long time and hung (although I 
didn't leave it running for hours to see whether it would finish).

From my perspective, the extra time is quite substantial (roughly 3.3x 
slower), and it doesn't seem worth it just to save an integer per occurrence 
in this case.

For computing the start/end-loc of function bodies, I tried 
SingleFileParseMode and SkipFunctionBodies separately (as a start). The source 
I tried this on looks like this:

  #include "MyClass.h"
  
  MyClass::MyClass() {
  }
  
  void MyClass::doOperation() {
  }

With SingleFileParseMode, I get several errors:

> MyClass.cpp:5:1: error: use of undeclared identifier 'MyClass'
> MyClass.cpp:8:6: error: use of undeclared identifier 'MyClass'

Then I cannot obtain any Decl* at the position of doOperation. With 
SingleFileParseMode, I'm also a bit wary that not processing headers will 
result in many inaccuracies. From our perspective, we are more willing to 
sacrifice disk space in order to have more accuracy and speed. For comparison, 
the index I worked with, containing the end-loc for all occurrences and also 
function start/end, is 201M for LLVM/Clang/Clangd, which is small to us.

With SkipFunctionBodies alone, I can get the Decl*, but 
FunctionDecl::getSourceRange() doesn't include the body; it stops after 
the arguments.

It would be very nice if we could do this cheaply, but it doesn't seem possible 
with those two flags alone. What did you have in mind for implementing an "API 
to gather any syntactic structure info"?
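For reference, a purely lexical fallback for recovering a body range could 
look like the sketch below. This is an assumption for discussion only, not 
something proposed in this thread and not a clang API: `findBodyRange` is a 
hypothetical helper that scans forward from the end of the reported 
declaration range to the matching closing brace. It deliberately ignores 
strings, character literals, comments, and preprocessor directives, all of 
which a real Lexer-based version would have to handle.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper, not a clang API. Starting at Offset (for example, the
// end of the range reported for the declaration), find the next '{' and its
// matching '}'. Returns {open, close} offsets, or {npos, npos} if no balanced
// body is found. Simplification: every brace character is counted, including
// ones inside string literals or comments.
std::pair<std::size_t, std::size_t> findBodyRange(const std::string &Buffer,
                                                  std::size_t Offset) {
  const std::size_t NPos = std::string::npos;

  std::size_t Open = Buffer.find('{', Offset);
  if (Open == NPos)
    return {NPos, NPos};

  int Depth = 0;
  for (std::size_t I = Open; I < Buffer.size(); ++I) {
    if (Buffer[I] == '{') {
      ++Depth;
    } else if (Buffer[I] == '}') {
      --Depth;
      if (Depth == 0)
        return {Open, I};
    }
  }
  return {NPos, NPos}; // Unbalanced braces.
}
```

Like the on-the-fly end-loc computation, this trades index size for work at 
query time, so it would be subject to the same kind of slowdown measured above.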


https://reviews.llvm.org/D39050



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
