mayya-sharipova commented on issue #1351: LUCENE-9280: Collectors to skip noncompetitive documents
URL: https://github.com/apache/lucene-solr/pull/1351#issuecomment-615261001

Sorry for bringing this up without finishing it, but I thought it was also worth reporting the test results on a smaller collection, `wikimedium1m`:

```
                 Task   QPS baseline    StdDev   QPS patch    StdDev   Pct diff
           TermDTSort         292.71   (15.1%)       59.60    (4.9%)   -79.6% ( -86% - -70%)
HighTermDayOfYearSort          60.01   (44.0%)       33.75   (13.6%)   -43.8% ( -70% -  24%)
WARNING: cat=HighTermDayOfYearSort: hit counts differ: 65216 vs 65093+
WARNING: cat=TermDTSort: hit counts differ: 68644 vs 507+
```

Here the proposed sort optimization substantially reduces performance. Because the data in these indexes is not monotonically increasing, `setBottom` is called many times. It looks like for smaller indexes (especially with data that is not monotonically increasing) the conventional sort is faster than the proposed optimization.

I am not sure how significant this reduction is.

- **Should we apply the optimization only for segments over 1 million docs?**
- **Should we apply the optimization only when the data is diverse enough?**

Or can we follow up on these proposals in subsequent PRs?
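
To make the remark about `setBottom` concrete, here is a minimal standalone sketch of how the order of values affects how often the bottom of a top-k queue changes. It is not Lucene code and does not reproduce the benchmark above: the class `BottomUpdateSketch` and the method `countBottomUpdates` are hypothetical names, and a plain `java.util.PriorityQueue` stands in for the collector's queue under the assumption of an ascending sort that keeps the k smallest values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical illustration only; not part of Lucene or of the PR.
class BottomUpdateSketch {

    // Collects the k smallest values in arrival order and counts how many
    // times the "bottom" (the worst value still retained) changes. Each such
    // change stands in for one setBottom call in this simplified model.
    static int countBottomUpdates(List<Long> values, int k) {
        // Max-heap: the head is the largest retained value, i.e. the bottom.
        PriorityQueue<Long> queue = new PriorityQueue<>(Collections.reverseOrder());
        int updates = 0;
        for (long v : values) {
            if (queue.size() < k) {
                queue.add(v);
                if (queue.size() == k) {
                    updates++; // the bottom is first known once the queue is full
                }
            } else if (v < queue.peek()) {
                queue.poll(); // evict the old bottom
                queue.add(v); // the new value is competitive
                updates++;
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int k = 10;
        List<Long> increasing = new ArrayList<>(n);
        for (long i = 0; i < n; i++) {
            increasing.add(i);
        }
        List<Long> shuffled = new ArrayList<>(increasing);
        Collections.shuffle(shuffled, new Random(42));

        System.out.println("increasing: " + countBottomUpdates(increasing, k));
        System.out.println("shuffled:   " + countBottomUpdates(shuffled, k));
    }
}
```

In this model, monotonically increasing input sets the bottom once, because no later value is competitive, while shuffled input keeps producing competitive values and so keeps updating the bottom. Whether that cost explains the measured QPS drop is a separate question that the sketch does not answer.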