epotyom commented on PR #13657:
URL: https://github.com/apache/lucene/pull/13657#issuecomment-2293467656

   I've made some temporary changes in luceneutil so that it runs only the couple of tasks that show the regression and produce meaningful profiler results. The profiler results we get when running all tasks seem to be dominated by samples from other tasks, e.g. faceting.
   
   Results after 20 runs:
   
   ```
                       Task    QPS baseline  StdDev    QPS my_modified_version  StdDev    Pct diff                 p-value
                    MedTerm          581.44  (3.8%)                     505.29  (3.5%)    -13.1% ( -19% -   -5%)   0.000
                   HighTerm          559.77  (3.5%)                     501.17  (3.5%)    -10.5% ( -16% -   -3%)   0.000
   ```
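
   As a quick sanity check on the `Pct diff` column, here is a minimal sketch. It assumes the column is the relative change in mean QPS from baseline to candidate, which is consistent with the numbers above but is not taken from the luceneutil source. `pct_diff` is a hypothetical helper, not part of luceneutil.

```shell
# pct_diff BASELINE_QPS CANDIDATE_QPS
# Prints the relative change in percent, rounded to one decimal place.
# Hypothetical helper for illustration only; not part of luceneutil.
pct_diff() {
  awk -v base="$1" -v cand="$2" \
    'BEGIN { printf "%.1f\n", (cand - base) / base * 100 }'
}

pct_diff 581.44 505.29   # MedTerm  -> -13.1
pct_diff 559.77 501.17   # HighTerm -> -10.5
```

   The sketch does not reproduce the range in parentheses, which luceneutil derives from the standard deviations.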
   
   The biggest difference in the profiler seems to be that we now spend more time in `org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer.score(float, long)`?
   
   Diff image:
   
   
![flamegraph_two_tasks_only](https://github.com/user-attachments/assets/847290e6-fb6a-440d-8131-a50759351234)
   
   
   JFR files: 
   [Archive.zip](https://github.com/user-attachments/files/16637306/Archive.zip)
   
   The code that was tested is slightly different from this PR, sharing 
branches just in case:
   
   - candidate: Branch with regression: 
https://github.com/epotyom/lucene/tree/IndexSearcher-search-regression
   - baseline: Branch with NO regression: 
https://github.com/epotyom/lucene/tree/IndexSearcher-search-NO-regression
   - Their diff: https://github.com/epotyom/lucene/pull/1/files
   
   luceneutil branch to reproduce: 
https://github.com/mikemccand/luceneutil/compare/main...epotyom:luceneutil:tasks_with_regression
   
   You'd need to generate the task file manually, as it seems to be too large for GitHub:
   
   ```
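   # Remove any previous output (-f: no error if the file does not exist yet).
   rm -f tasks/wikimedium.10M.regressed.tasks
   # Keep only the two regressed task categories.
   grep -E '^(MedTerm|HighTerm):' tasks/wikimedium.10M.nostopwords.tasks > tasks/wikimedium.10M.regressed.tasks.1
   # Repeat the filtered tasks 10000 times to build the final task file.
   for n in {1..10000}; do cat tasks/wikimedium.10M.regressed.tasks.1 >> tasks/wikimedium.10M.regressed.tasks; done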
   ```
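
   To confirm the generated file contains only the two intended categories, a small check along these lines may help. It assumes each task line starts with `<category>:`, as the `grep` pattern above implies. `count_tasks` is a hypothetical helper, not part of luceneutil.

```shell
# count_tasks FILE
# Prints "<category> <count>" for each task category, where the category
# is the text before the first ':' on each line. Output is sorted by name.
# Hypothetical helper for illustration only; not part of luceneutil.
count_tasks() {
  awk -F: '{ n[$1]++ } END { for (c in n) print c, n[c] }' "$1" | sort
}

# Run the check only if the generated file is present.
if [ -f tasks/wikimedium.10M.regressed.tasks ]; then
  count_tasks tasks/wikimedium.10M.regressed.tasks
fi
```

   If the generation worked, only `HighTerm` and `MedTerm` should be listed, each with a count that is a multiple of 10000.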


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

