ilya-biryukov added inline comments.

================
Comment at: clangd/XRefs.cpp:578
+    bool Invalid;
+    StringRef Buffer = SM.getBufferData(SM.getFileID(StartLoc), &Invalid);
+    if (!Invalid) {
----------------
ilya-biryukov wrote:
> ilya-biryukov wrote:
> > malaperle wrote:
> > > ilya-biryukov wrote:
> > > > Unfortunately we can't get the buffer here, because for the preamble 
> > > > macros the files might change on disk after we build the preamble and 
> > > > we can end up reading a different version of the file. Which can in 
> > > > turn lead to crashes as the offsets obtained from the source locations 
> > > > can point outside the buffer for the corresponding file.
> > > > 
> > > > This is really annoying and generally the solution is to try find a 
> > > > different way to obtain the same information, e.g. by looking at what 
> > > > `MacroInfo` has available.
> > > > However, I don't know of a good workaround for this. Happy to look into 
> > > > ways of providing something close to a macro definition from 
> > > > `MacroInfo` or other sources, can scout for this tomorrow.
> > > I tried to do something like MacroInfo::dump. One problem is that we 
> > > don't always have enough information about literals. For example:
> > > 
> > > ```
> > > #define MACRO 0
> > > int main() { return MACRO; }
> > > ```
> > > Becomes:
> > > ```
> > > #define MACRO numeric_constant
> > > ```
> > > 
> > > I poked around the Token code and I didn't find a way to retrieve the 
> > > literal without going through the buffer (Preprocessor::getSpelling for 
> > > example).
> > > 
> > > What you mention is only a problem when you use the option of storing 
> > > preambles on disk, right? So something would need to modify the preamble 
> > > files on disk, which are stored in some temp folder. It doesn't sound 
> > > like that could happen frequently. There is also bounds checking in the 
> > > code above so it shouldn't crash, right? Even if the bounds are 
> > > incorrect, it will be no different than the Range we return to the client 
> > > for document/definition, i.e. the client can use the range on a newly 
> > > modified header and get it wrong. I also tested renaming the pch file and 
> > > then editing the source file and it results in an invalid AST. So either 
> > > way modifying the pch would be bad news not for just macro hover.
> > The problem shows up when the header files are modified, not the preamble 
> > files (otherwise we'll be fine, in clangd we do assume the .pch files we 
> > produce are not externally modified).
> > Which is pretty frequent, and we've seen real crashes coming from fetching 
> > documentation from the preambles.
> > 
> > I'll try to investigate how Preprocessor accesses the tokens of macro 
> > definitions from preabmle. There are two possible outcomes I expect:
> > 1. we find a way to access the spelling of the tokens in the same way PP 
> > does and it's safe,
> > 2. we realize preprocessor might crash with the preabmle because it 
> > accesses the spelling of the tokens.
> > 
> > I hope to find (1), but (2) is also very realistic, unfortunately :-(
> I think the preprocessor does the same thing, therefore the compiler can 
> potentially access the header files to get spelling of the tokens coming from 
> headers.
> This means that clangd can potentially crash if the headers are changed 
> concurrently with the operations that use preabmles (code completion, being 
> the most common one).
> 
> In that case, I agree that it's fine to use the buffers for headers here and 
> we should fix this separately, probably by snapshotting the contents of 
> accessed headers.
And I'm happy to take a look at reproducing the PP crash that I'm talking about 
here and fixing this.


Repository:
  rCTE Clang Tools Extra

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55250/new/

https://reviews.llvm.org/D55250



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to