mikemccand commented on pull request #128: URL: https://github.com/apache/lucene/pull/128#issuecomment-851527406
Thank you for all the awesome iterations here @zacharymorn!

To get the best speedup, even at `-slow`, we should do concurrency both ways (across segments and across the parts within each segment), and then sort those tasks by decreasing expected cost. This way the work queue would first output all postings checks (across all segments), one per thread, followed by doc values, etc.

We could even get a bit crazy, e.g. checking postings for a tiny segment is surely expected to be faster than checking doc values for a massive segment, so the queue could order tasks by estimated cost rather than strictly by part.

But we can add such complexity later -- the PR now ("thread per segment") is surely a great step forward too :)

And +1 to spin off a separate issue to change `CheckIndex` to default to `-fast` -- this is really long overdue since we added end-to-end checksums to Lucene!
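
A rough sketch of the cost-ordered work queue idea, in Java. Everything here is hypothetical: `CheckTask`, `estimateCost`, and the per-part weights are invented for illustration and are not part of `CheckIndex` or this PR; the real checks are replaced by a placeholder string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CostSortedChecks {

  /** Hypothetical unit of work: one part (postings, doc values, ...) of one segment. */
  public static final class CheckTask {
    final String segment;
    final String part;
    final long expectedCost;

    CheckTask(String segment, String part, int maxDoc) {
      this.segment = segment;
      this.part = part;
      this.expectedCost = estimateCost(part, maxDoc);
    }

    public String label() {
      return segment + "/" + part;
    }
  }

  /** Made-up cost model (an assumption): doc count times an invented per-part weight. */
  static long estimateCost(String part, int maxDoc) {
    long weight = part.equals("postings") ? 10 : part.equals("docvalues") ? 4 : 1;
    return weight * maxDoc;
  }

  /** Returns a copy of the tasks, most expensive first. */
  public static List<CheckTask> orderByDecreasingCost(List<CheckTask> tasks) {
    List<CheckTask> sorted = new ArrayList<>(tasks);
    sorted.sort((a, b) -> Long.compare(b.expectedCost, a.expectedCost));
    return sorted;
  }

  /** One massive segment (_0) and one tiny segment (_1), two parts each. */
  public static List<CheckTask> exampleTasks() {
    List<CheckTask> tasks = new ArrayList<>();
    tasks.add(new CheckTask("_1", "postings", 100));
    tasks.add(new CheckTask("_1", "docvalues", 100));
    tasks.add(new CheckTask("_0", "docvalues", 1_000_000));
    tasks.add(new CheckTask("_0", "postings", 1_000_000));
    return tasks;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    List<Future<String>> results = new ArrayList<>();
    // Submitting in sorted order lets idle threads pick up the costliest remaining task.
    for (CheckTask task : orderByDecreasingCost(exampleTasks())) {
      results.add(pool.submit(() -> "checked " + task.label())); // placeholder for the real check
    }
    for (Future<String> result : results) {
      System.out.println(result.get());
    }
    pool.shutdown();
  }
}
```

With these invented weights, doc values for the massive segment `_0` are queued ahead of postings for the tiny segment `_1`, which is the "a bit crazy" ordering mentioned above. A real cost estimate would need to come from actual segment statistics.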