jpountz commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1164436120
@zacharymorn FYI I played with a slightly different approach that implements BMM as a bulk scorer instead of a scorer, which I was hoping would make the bookkeeping more lightweight: https://github.com/jpountz/lucene/tree/maxscore. It could be interesting to compare with your implementation. It has one optimization that seemed to help and that your scorer doesn't have: for every non-essential scorer, it checks whether the score obtained so far plus the sum of the max scores of the non-essential scorers that haven't been checked yet is still competitive. I got the following results on one run on wikimedium10m:

```
Task                        QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                 p-value
OrHighNotLow                     1493.13   (6.5%)                  1445.29   (5.1%)     -3.2%  (-13% -   8%)    0.083
OrNotHighMed                     1410.19   (3.8%)                  1373.37   (3.1%)     -2.6%  ( -9% -   4%)    0.017
OrNotHighHigh                    1057.88   (5.1%)                  1031.19   (4.4%)     -2.5%  (-11% -   7%)    0.096
OrHighNotMed                     1525.10   (5.2%)                  1486.80   (4.4%)     -2.5%  (-11% -   7%)    0.098
OrHighNotHigh                    1250.31   (4.3%)                  1221.99   (3.4%)     -2.3%  ( -9% -   5%)    0.062
IntNRQ                            531.54   (2.9%)                   522.49   (2.7%)     -1.7%  ( -7% -   3%)    0.053
Fuzzy1                            111.13   (2.1%)                   109.80   (2.6%)     -1.2%  ( -5% -   3%)    0.107
AndHighMed                        386.29   (4.1%)                   381.84   (3.3%)     -1.2%  ( -8% -   6%)    0.329
AndHighHigh                        78.96   (5.6%)                    78.18   (4.7%)     -1.0%  (-10% -   9%)    0.548
BrowseDateSSDVFacets                4.51  (12.6%)                     4.47  (12.4%)     -0.8%  (-22% -  27%)    0.836
OrNotHighLow                     1316.24   (3.8%)                  1305.93   (3.1%)     -0.8%  ( -7% -   6%)    0.476
OrHighMedDayTaxoFacets             20.87   (5.1%)                    20.71   (4.2%)     -0.8%  ( -9% -   9%)    0.609
BrowseMonthSSDVFacets              23.54   (6.4%)                    23.42   (7.4%)     -0.5%  (-13% -  14%)    0.817
BrowseRandomLabelTaxoFacets        37.54   (1.7%)                    37.37   (1.9%)     -0.5%  ( -4% -   3%)    0.432
MedSpanNear                        68.68   (1.7%)                    68.37   (2.2%)     -0.4%  ( -4% -   3%)    0.474
AndHighHighDayTaxoFacets           10.78   (5.9%)                    10.73   (4.7%)     -0.4%  (-10% -  10%)    0.794
BrowseMonthTaxoFacets              28.39  (10.0%)                    28.29   (9.1%)     -0.3%  (-17% -  20%)    0.910
HighTermDayOfYearSort             171.78  (13.7%)                   171.22  (13.2%)     -0.3%  (-23% -  30%)    0.939
PKLookup                          245.27   (2.2%)                   244.52   (1.9%)     -0.3%  ( -4% -   3%)    0.635
HighSloppyPhrase                   39.08   (2.9%)                    38.96   (4.3%)     -0.3%  ( -7% -   7%)    0.795
HighTermMonthSort                 167.47  (15.1%)                   167.06  (14.7%)     -0.2%  (-26% -  34%)    0.959
HighPhrase                        250.14   (2.8%)                   249.53   (2.3%)     -0.2%  ( -5% -   5%)    0.767
TermDTSort                        138.22  (14.0%)                   137.97  (13.4%)     -0.2%  (-24% -  31%)    0.967
Fuzzy2                             55.22   (1.6%)                    55.17   (1.5%)     -0.1%  ( -3% -   3%)    0.837
MedTerm                          1844.25   (6.4%)                  1843.10   (4.9%)     -0.1%  (-10% -  11%)    0.972
MedSloppyPhrase                    15.34   (2.2%)                    15.33   (3.9%)     -0.1%  ( -5% -   6%)    0.954
Prefix3                           110.03   (2.6%)                   110.07   (1.8%)      0.0%  ( -4% -   4%)    0.962
HighSpanNear                        7.95   (1.7%)                     7.97   (1.7%)      0.2%  ( -3% -   3%)    0.772
BrowseDayOfYearTaxoFacets          46.78   (1.9%)                    46.86   (2.1%)      0.2%  ( -3% -   4%)    0.788
AndHighLow                       1291.99   (2.6%)                  1294.28   (3.4%)      0.2%  ( -5% -   6%)    0.854
LowSpanNear                        47.55   (1.5%)                    47.64   (1.4%)      0.2%  ( -2% -   3%)    0.697
Wildcard                          157.83   (1.5%)                   158.14   (1.3%)      0.2%  ( -2% -   3%)    0.661
LowPhrase                          83.20   (2.3%)                    83.37   (2.1%)      0.2%  ( -4% -   4%)    0.773
Respell                            95.18   (1.4%)                    95.47   (1.3%)      0.3%  ( -2% -   3%)    0.492
AndHighMedDayTaxoFacets            51.97   (1.8%)                    52.16   (2.1%)      0.4%  ( -3% -   4%)    0.553
BrowseDateTaxoFacets               45.77   (2.0%)                    45.98   (1.9%)      0.5%  ( -3% -   4%)    0.452
MedTermDayTaxoFacets               60.66   (5.9%)                    61.03   (5.0%)      0.6%  ( -9% -  12%)    0.718
MedPhrase                          57.67   (3.1%)                    58.06   (2.6%)      0.7%  ( -4% -   6%)    0.452
BrowseDayOfYearSSDVFacets          20.40   (6.0%)                    20.57   (4.2%)      0.8%  ( -8% -  11%)    0.609
LowSloppyPhrase                    37.59   (4.0%)                    38.00   (3.6%)      1.1%  ( -6% -   9%)    0.376
BrowseRandomLabelSSDVFacets        15.25   (5.2%)                    15.41   (6.9%)      1.1%  (-10% -  13%)    0.571
HighTerm                         2001.23   (6.4%)                  2025.82   (4.9%)      1.2%  ( -9% -  13%)    0.493
LowTerm                          2092.97   (4.3%)                  2119.02   (5.5%)      1.2%  ( -8% -  11%)    0.423
MedIntervalsOrdered                56.91   (3.9%)                    57.92   (3.0%)      1.8%  ( -4% -   9%)    0.107
HighIntervalsOrdered               16.67   (6.2%)                    16.97   (4.6%)      1.8%  ( -8% -  13%)    0.297
LowIntervalsOrdered                20.18   (4.3%)                    20.57   (3.3%)      1.9%  ( -5% -  10%)    0.113
HighTermTitleBDVSort              182.32  (14.2%)                   186.92  (22.0%)      2.5%  (-29% -  45%)    0.667
OrHighLow                        1235.23   (1.8%)                  1484.12   (4.8%)     20.1%  ( 13% -  27%)    0.000
OrHighMed                         156.75   (4.7%)                   200.46   (4.7%)     27.9%  ( 17% -  39%)    0.000
OrHighHigh                         25.07   (5.2%)                    48.30   (9.1%)     92.6%  ( 74% - 112%)    0.000
```
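
For illustration, the non-essential scorer check described in the comment (score obtained so far plus the sum of max scores of the non-essential scorers not yet checked) could be sketched roughly as follows. This is not code from either branch. It is a minimal, Lucene-free sketch in which plain arrays stand in for the scorers, and the class name, method name, and parameters are all hypothetical.

```java
import java.util.OptionalDouble;

/**
 * Hypothetical, Lucene-free sketch of the per-scorer competitiveness check.
 * Plain arrays stand in for the non-essential scorers; none of these names
 * come from the linked branch.
 */
final class NonEssentialCheckSketch {

  /**
   * @param essentialScore score already collected from the essential scorers
   * @param maxScores max score of each non-essential scorer, in evaluation order
   * @param docScores score each non-essential scorer contributes for this doc
   *     (0 if it does not match)
   * @param minCompetitiveScore lowest score that can still enter the top hits
   * @return the final score, or empty if the doc was ruled out
   */
  static OptionalDouble scoreIfCompetitive(
      double essentialScore,
      double[] maxScores,
      double[] docScores,
      double minCompetitiveScore) {
    // remaining[i] = sum of max scores of scorers i..n-1, i.e. those not checked yet.
    double[] remaining = new double[maxScores.length + 1];
    for (int i = maxScores.length - 1; i >= 0; i--) {
      remaining[i] = remaining[i + 1] + maxScores[i];
    }

    double score = essentialScore;
    for (int i = 0; i < maxScores.length; i++) {
      // Best case: every unchecked scorer contributes its full max score.
      if (score + remaining[i] < minCompetitiveScore) {
        // Even the best case is not competitive, so skip the remaining scorers.
        return OptionalDouble.empty();
      }
      score += docScores[i];
    }
    return score >= minCompetitiveScore
        ? OptionalDouble.of(score)
        : OptionalDouble.empty();
  }
}
```

With an essential score of 1.0, max scores {2.0, 1.0, 0.5}, and a minimum competitive score of 4.0, a doc that the first non-essential scorer does not match is dropped before the second scorer is consulted, since 1.0 + 1.5 is below 4.0. A real scorer would also need to ensure that floating-point rounding never underestimates the upper bound, which this sketch ignores.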