zacharymorn commented on pull request #128: URL: https://github.com/apache/lucene/pull/128#issuecomment-843795302
> I love this change -- `CheckIndex` is often run in the "heat of the moment" so time is of the essence to recovering your Lucene index and getting things back online. Yet it is [very slow today, even on highly concurrent boxes](https://home.apache.org/~mikemccand/lucenebench/checkIndexTime.html). > > I left some small comments that I think are fine to do in a followon PR. This change is already massive enough (GitHub was at first not willing to even render the `CheckIndex` diffs!) and impactful enough that we can do the improvements after. Thanks Michael for the review and feedback. I think as far as the original scope of the jira ticket goes, there's also the parallelization across segments that has not been implemented yet. But agree that this PR is already big and should already provide a good speed up on powerful concurrent boxes (up to 11 concurrent checks for each segment), so we can probably let it run for a while and see if parallelization across segments is still needed, which from my quick in-mind coding will definitely require much more changes for concurrency control to get it right. One thing I'm still researching is that, it seems there's limited direct test coverage for this `CheckIndex` class? I see there's `TestCheckIndex`, but it only has 4 tests, and the majority of its functionalities seems to be put under tests by other index testing utilities and test cases. Shall I still add a few more tests for these changes (and should I put them in `TestCheckIndex`)? On the other hand, I've been running `./gradlew check -Ptests.nightly=true -Pvalidation.git.failOnModified=false` the whole time and all tests have been passing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
