dexonsmith added subscribers: Bigcheese, jansvoboda11, arphaman.
dexonsmith added a comment.

In D92160#2444732 <https://reviews.llvm.org/D92160#2444732>, @OikawaKirie wrote:

> - In my solution, I use a pretty straightforward approach, which is to have 
> an individual cache (for `FileEntry` and any other related things) for each 
> working directory, and switch caches between working directories.

IIUC, this means the absolute paths won't be cached between changing working 
directories. Is that correct?

> - In your alternative solution, if I standerstand it correctly, 
> `dropRelativePaths` is called in `notifyCurrentWorkingDirectoryChange`. All 
> those `FileEntry` which are based on relative paths are dropped from the 
> cache. I think this is a good idea, especially for keeping `FileManager` 
> simple. However I'm not sure how many files are based on relative paths. If 
> `dropRelativePaths` drops too many files and the working directory is 
> switched back to a previous one (this could happen when analyzing with a 
> `compile_commands.json`), it might result in many cache misses, thus 
> involving more system calls to open files.

Yes, I think that's the tradeoff. My sense is that `FileManager` is 
usually/often fed absolute paths, but it likely depends on the build system and 
environment.

> There was an another solution I have ever tried, without things like 
> `notifyCurrentWorkingDirectoryChange`. The solution is still straightforward, 
> which is to use absolute path (instead of filename) to query/store a 
> `FileEntry` from/to the cache. Actually I have tried this before but it ended 
> up with lots of test cases failed. I don't know where I did wrong. If you 
> think this approach is okay, I will continue working on this, and it might 
> take some time.

I'm not surprised some tests failed with this but it might be worth looking 
deeper to understand current expectations. It might be valuable to look through 
Git history to find why we're not always making the paths absolute right now. 
We do it sometimes (see calls to `FixupRelativePath` and `makeAbsolutePath`) 
but not always. Why is that the case? What invariants do we need to keep 
functioning?

Another point is that `FileManager` takes the working directory at construction 
time via `FileSystemOptions`. There's a FIXME to remove that argument, but (as 
you've seen) the assumption is baked in that the CWD doesn't change. I think 
the current patch may not completely fix it, since you can't modify 
`FileSystemOptions`.

One idea I have (not fully fleshed out) is to drop FileSystemOpts as a 
constructor argument (instead grab / store the CWD from the VFS), and, inspired 
by your approach, split the data structures between relative and absolute 
paths. The existing data structures would only store absolute paths, but there 
are new ones for relative paths that can be cached and swapped out. Something 
like this:

  /// Current working directory. Grabbed from the (new) VFS whenever it's
  /// changed, and updated if the file manager is notified that the VFS's
  /// CWD changes.
  std::string CWD;
  
  struct RelativePaths {
    StringMap<FileEntryRef::MapValue> Files;
    StringMap<llvm::ErrorOr<DirectoryEntry &>> Dirs;
  };
  
  /// Relative entries for the current CWD.
  RelativePaths RelativeEntries;
  
  /// Cached relative entries for other CWDs.
  Optional<StringMap<RelativePaths>> CachedRelativeEntries;

The idea is that the existing data structures would now be restricted to 
absolute paths. Any query that uses a relative path checks in 
`RelativeEntries`; if nothing is there, the path is fixed up to an absolute 
path, inserted / looked up in the main data structures, and then an alias is 
added to `RelativeEntries`. If the CWD changes (atypical case, but possible), 
we do a similar cache dance to the one in your current patch.

@arphaman @Bigcheese, can you share whether we need to solve similar issues in 
clang-scan-deps? What approach is taken there? Do we just require absolute 
paths? (I know currently we don't reuse a FileManager, but my recollection is 
we had a prototype that mostly worked...)

Another thing to look at: is this a problem for implicit modules builds (the 
`CompilerInstance` that builds the module will share a `FileManager`, but use a 
different CWD)? Why or why not? (Again, maybe we just require absolute paths 
when doing implicit modules...)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92160/new/

https://reviews.llvm.org/D92160

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to