epotyom commented on PR #13657: URL: https://github.com/apache/lucene/pull/13657#issuecomment-2293467656
I've made some temporary changes in luceneutil so that I can run only the couple of tasks that show the regression and get meaningful profiler results. The profiler results we get when running all tasks seem to contain too many samples from other tasks, e.g. faceting.

Results after 20 runs:
```
    Task    QPS baseline    StdDev    QPS my_modified_version    StdDev    Pct diff                  p-value
 MedTerm          581.44    (3.8%)                     505.29    (3.5%)    -13.1% ( -19% -  -5%)     0.000
HighTerm          559.77    (3.5%)                     501.17    (3.5%)    -10.5% ( -16% -  -3%)     0.000
```

The biggest difference in the profiler seems to be that we now spend more time in `org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer.score(float, long)`.

Diff image: (image not preserved)

JFR files: [Archive.zip](https://github.com/user-attachments/files/16637306/Archive.zip)

The code that was tested is slightly different from this PR, so I'm sharing the branches just in case:
- candidate: branch with regression: https://github.com/epotyom/lucene/tree/IndexSearcher-search-regression
- baseline: branch with NO regression: https://github.com/epotyom/lucene/tree/IndexSearcher-search-NO-regression
- Their diff: https://github.com/epotyom/lucene/pull/1/files

luceneutil branch to reproduce: https://github.com/mikemccand/luceneutil/compare/main...epotyom:luceneutil:tasks_with_regression

You'd need to generate the task file manually, as it seems to be too large for GitHub:
```
# Start from a clean output file (-f: no error if it does not exist yet)
rm -f tasks/wikimedium.10M.regressed.tasks
# Keep only the two task categories that show the regression
grep -E '^(MedTerm|HighTerm):' tasks/wikimedium.10M.nostopwords.tasks > tasks/wikimedium.10M.regressed.tasks.1
# Repeat the filtered tasks 10000 times to build the final task file
for n in {1..10000}; do cat tasks/wikimedium.10M.regressed.tasks.1 >> tasks/wikimedium.10M.regressed.tasks; done
```

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org