amccarth added a comment.
This patch was reverted a while back because a couple DirectoryWatcher tests
routinely timed out on build bots even though they work when run in isolation.
I believe the problem is that, on a machine busy doing a build, the startup of
the watcher thread is delayed (either because the scheduler algorithm or the
thread pool policy). Thus the "initial scan" on the test thread can complete
_before_ the watcher thread has called ReadDirectoryChangesW. This leaves a
window of time where the test can change a file that the watcher thread will
miss.
To test this hypothesis, I took this patch and created one more event called
`WatcherThreadReady`. I have the watcher thread set this event after
successfully calling ReadDirectoryChangesW. In the constructor, I changed:
if (WaitForInitialSync)
InitialScan();
to
if (WaitForInitialSync) {
::WaitForSingleObject(WatcherThreadReady, 10000);
InitialScan();
}
This is crude, but it seems to be effective. The tests pass reliably for me
when my machine is fully loaded. I didn't use an INFINITE timeout because it
seems possibly missing a file change is less bad than hanging forever. I
didn't even bother to check the result from the wait because there's nothing
sane to do besides "best effort" if something goes wrong. I used a Windows
event because those are most familiar to me and it's Windows-only code. But
it certainly could be done with other type of synchronization object.
There may be more elegant ways to solve this, but something like this directly
addresses the root cause with fairly minimal changes.
I wonder if the Linux and Mac implementations might suffer from a similar
window but the bug is rare because of differences in thread scheduler.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D88666/new/
https://reviews.llvm.org/D88666
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits