zacharymorn commented on pull request #113: URL: https://github.com/apache/lucene/pull/113#issuecomment-836122884
Hi @jpountz, I've ported your changes to this BulkScorer implementation as well, and ran both the 5-term OrMed task (`OrMedMedMedMedMed`) and the full wikimedium5m benchmark:

```
OrMedMedMedMedMed run 1
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff  p-value
OrMedMedMedMedMed            40.90   (8.5%)    39.37   (6.8%)   -3.7%  ( -17% - 12%)  0.126
PKLookup                    228.21   (1.9%)   223.87   (2.2%)   -1.9%  ( -5% - 2%)  0.004
```

```
OrMedMedMedMedMed run 2
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff  p-value
OrMedMedMedMedMed            39.72   (5.0%)    38.01   (7.4%)   -4.3%  ( -15% - 8%)  0.030
PKLookup                    226.45   (2.1%)   223.28   (2.3%)   -1.4%  ( -5% - 3%)  0.048
```

```
OrMedMedMedMedMed run 3
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff  p-value
PKLookup                    226.41   (3.3%)   222.43   (2.3%)   -1.8%  ( -7% - 3%)  0.052
OrMedMedMedMedMed            38.83   (6.7%)    39.27   (7.1%)    1.1%  ( -11% - 15%)  0.600
```

```
full wikimedium5m run 1
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff  p-value
Wildcard                    376.63   (5.8%)   360.47   (6.2%)   -4.3%  ( -15% - 8%)  0.024
OrNotHighHigh               745.74   (4.5%)   730.51   (5.7%)   -2.0%  ( -11% - 8%)  0.208
Fuzzy2                       40.89   (6.0%)    40.20   (8.5%)   -1.7%  ( -15% - 13%)  0.465
HighTermDayOfYearSort       354.09  (16.6%)   348.53  (13.2%)   -1.6%  ( -26% - 33%)  0.740
BrowseMonthSSDVFacets        31.93   (3.0%)    31.50   (6.5%)   -1.3%  ( -10% - 8%)  0.402
LowTerm                    1978.09   (5.1%)  1956.82   (5.3%)   -1.1%  ( -10% - 9%)  0.514
IntNRQ                      194.54   (3.6%)   193.05   (4.2%)   -0.8%  ( -8% - 7%)  0.537
HighTermMonthSort           330.71  (10.6%)   328.18   (9.7%)   -0.8%  ( -19% - 21%)  0.812
OrHighNotLow                806.97   (6.4%)   801.14   (5.6%)   -0.7%  ( -11% - 11%)  0.702
BrowseDayOfYearSSDVFacets    28.57   (1.7%)    28.39   (2.0%)   -0.6%  ( -4% - 3%)  0.294
AndHighHigh                  70.54   (3.8%)    70.12   (4.6%)   -0.6%  ( -8% - 8%)  0.657
Respell                      78.30   (2.0%)    77.93   (2.1%)   -0.5%  ( -4% - 3%)  0.463
OrHighNotHigh               772.33   (5.0%)   768.86   (5.8%)   -0.4%  ( -10% - 10%)  0.795
Prefix3                     133.26   (7.3%)   132.68   (8.8%)   -0.4%  ( -15% - 16%)  0.865
HighTermTitleBDVSort        189.02  (17.9%)   188.23  (12.7%)   -0.4%  ( -26% - 36%)  0.932
MedSpanNear                 129.28   (2.6%)   129.09   (3.1%)   -0.1%  ( -5% - 5%)  0.871
OrNotHighLow                900.87   (3.4%)   900.01   (3.7%)   -0.1%  ( -6% - 7%)  0.932
LowPhrase                    61.05   (2.7%)    61.00   (3.1%)   -0.1%  ( -5% - 5%)  0.918
HighSpanNear                 96.65   (3.2%)    96.63   (3.3%)   -0.0%  ( -6% - 6%)  0.990
Fuzzy1                       67.13   (6.9%)    67.15   (6.6%)    0.0%  ( -12% - 14%)  0.988
OrHighNotMed                811.67   (4.9%)   812.18   (5.6%)    0.1%  ( -9% - 11%)  0.969
BrowseMonthTaxoFacets        13.21   (2.8%)    13.22   (2.8%)    0.1%  ( -5% - 5%)  0.941
HighPhrase                   34.18   (3.1%)    34.21   (3.3%)    0.1%  ( -6% - 6%)  0.939
AndHighLow                  905.10   (4.0%)   905.96   (5.0%)    0.1%  ( -8% - 9%)  0.947
MedPhrase                    87.90   (2.8%)    88.10   (3.0%)    0.2%  ( -5% - 6%)  0.811
BrowseDateTaxoFacets         11.06   (3.9%)    11.09   (3.4%)    0.3%  ( -6% - 7%)  0.811
BrowseDayOfYearTaxoFacets    11.05   (3.8%)    11.08   (3.4%)    0.3%  ( -6% - 7%)  0.801
MedSloppyPhrase             152.46   (3.1%)   152.89   (2.7%)    0.3%  ( -5% - 6%)  0.757
PKLookup                    215.89   (2.8%)   216.86   (3.8%)    0.5%  ( -5% - 7%)  0.667
TermDTSort                  436.33  (15.6%)   438.31  (13.8%)    0.5%  ( -25% - 35%)  0.922
LowSpanNear                 119.90   (2.4%)   120.46   (2.3%)    0.5%  ( -4% - 5%)  0.533
HighSloppyPhrase             28.82   (3.9%)    28.99   (2.8%)    0.6%  ( -5% - 7%)  0.586
AndHighMed                  475.36   (5.6%)   478.26   (5.8%)    0.6%  ( -10% - 12%)  0.735
LowSloppyPhrase             388.99   (3.4%)   392.32   (2.9%)    0.9%  ( -5% - 7%)  0.387
OrNotHighMed                774.61   (6.6%)   781.75   (5.6%)    0.9%  ( -10% - 14%)  0.633
HighTerm                   1268.49   (5.6%)  1290.00   (5.6%)    1.7%  ( -9% - 13%)  0.340
HighIntervalsOrdered        417.04   (3.1%)   425.09   (2.9%)    1.9%  ( -3% - 8%)  0.043
MedTerm                    1583.25   (5.4%)  1627.50   (5.5%)    2.8%  ( -7% - 14%)  0.107
OrHighHigh                   61.28   (3.6%)    64.46   (3.0%)    5.2%  ( -1% - 12%)  0.000
OrHighMed                    79.13   (2.9%)    85.68   (3.3%)    8.3%  ( 1% - 14%)  0.000
OrHighLow                   231.58   (4.7%)   683.73  (16.0%)  195.2%  ( 166% - 226%)  0.000
```

```
full wikimedium5m run 2
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff  p-value
OrHighHigh                   97.84   (2.7%)    78.42   (2.1%)  -19.8%  ( -24% - -15%)  0.000
HighTermTitleBDVSort        223.86  (17.8%)   217.70  (16.4%)   -2.8%  ( -31% - 38%)  0.611
OrNotHighLow                964.32   (2.6%)   945.18   (6.0%)   -2.0%  ( -10% - 6%)  0.175
OrHighNotLow                814.26   (5.8%)   799.46   (5.7%)   -1.8%  ( -12% - 10%)  0.316
HighTermMonthSort           342.78  (14.3%)   338.52  (15.6%)   -1.2%  ( -27% - 33%)  0.793
HighTermDayOfYearSort       259.90  (13.7%)   257.22  (13.8%)   -1.0%  ( -25% - 30%)  0.812
TermDTSort                  234.69  (10.9%)   232.30  (12.3%)   -1.0%  ( -21% - 24%)  0.782
AndHighHigh                  93.13   (3.0%)    92.19   (3.5%)   -1.0%  ( -7% - 5%)  0.326
MedTerm                    1410.12   (3.9%)  1398.22   (2.4%)   -0.8%  ( -6% - 5%)  0.408
OrNotHighHigh               679.95   (6.4%)   674.81   (6.3%)   -0.8%  ( -12% - 12%)  0.706
OrHighNotMed                744.68   (4.4%)   739.05   (5.8%)   -0.8%  ( -10% - 9%)  0.644
AndHighMed                  451.76   (3.8%)   448.59   (3.4%)   -0.7%  ( -7% - 6%)  0.540
AndHighLow                  969.58   (5.6%)   963.88   (4.8%)   -0.6%  ( -10% - 10%)  0.720
LowSpanNear                  25.23   (4.2%)    25.11   (2.9%)   -0.5%  ( -7% - 6%)  0.666
MedSpanNear                  26.41   (2.4%)    26.33   (1.5%)   -0.3%  ( -4% - 3%)  0.610
HighIntervalsOrdered         37.09   (1.9%)    36.98   (2.4%)   -0.3%  ( -4% - 4%)  0.669
OrHighNotHigh               679.06   (4.3%)   677.17   (5.8%)   -0.3%  ( -9% - 10%)  0.863
HighSpanNear                 32.19   (2.2%)    32.14   (2.1%)   -0.2%  ( -4% - 4%)  0.822
IntNRQ                      322.43   (2.0%)   322.04   (2.5%)   -0.1%  ( -4% - 4%)  0.865
BrowseMonthSSDVFacets        32.22   (1.7%)    32.25   (1.5%)    0.1%  ( -3% - 3%)  0.896
LowSloppyPhrase              39.45   (2.6%)    39.48   (2.4%)    0.1%  ( -4% - 5%)  0.921
BrowseDayOfYearSSDVFacets    28.20   (5.4%)    28.23   (5.2%)    0.1%  ( -9% - 11%)  0.947
HighSloppyPhrase             56.95   (2.4%)    57.03   (2.4%)    0.1%  ( -4% - 4%)  0.846
PKLookup                    217.45   (3.9%)   217.78   (4.2%)    0.2%  ( -7% - 8%)  0.906
LowTerm                    1614.00   (3.7%)  1616.52   (4.3%)    0.2%  ( -7% - 8%)  0.902
MedSloppyPhrase             335.24   (2.8%)   336.50   (2.7%)    0.4%  ( -4% - 6%)  0.665
MedPhrase                   257.34   (2.7%)   258.59   (1.9%)    0.5%  ( -4% - 5%)  0.515
HighPhrase                  100.07   (2.1%)   100.66   (1.7%)    0.6%  ( -3% - 4%)  0.332
BrowseDayOfYearTaxoFacets    11.20   (2.8%)    11.28   (2.5%)    0.7%  ( -4% - 6%)  0.410
BrowseMonthTaxoFacets        13.07   (2.4%)    13.17   (1.9%)    0.7%  ( -3% - 5%)  0.283
BrowseDateTaxoFacets         11.18   (2.9%)    11.27   (2.5%)    0.8%  ( -4% - 6%)  0.369
Wildcard                     55.50   (4.6%)    56.08   (2.9%)    1.0%  ( -6% - 8%)  0.391
LowPhrase                   501.30   (3.5%)   506.61   (3.2%)    1.1%  ( -5% - 8%)  0.319
Prefix3                     107.90   (6.5%)   109.16   (3.9%)    1.2%  ( -8% - 12%)  0.491
Respell                      73.30   (3.3%)    74.17   (2.6%)    1.2%  ( -4% - 7%)  0.210
OrNotHighMed                625.05   (4.3%)   634.75   (4.9%)    1.6%  ( -7% - 11%)  0.289
Fuzzy2                       67.34  (18.7%)    68.92  (16.8%)    2.3%  ( -27% - 46%)  0.677
HighTerm                   1559.83   (4.6%)  1608.90   (5.3%)    3.1%  ( -6% - 13%)  0.044
Fuzzy1                       74.41  (17.1%)    77.02  (13.2%)    3.5%  ( -22% - 40%)  0.467
OrHighMed                   176.89   (4.0%)   192.17   (2.7%)    8.6%  ( 1% - 16%)  0.000
OrHighLow                   179.14   (3.0%)   634.97  (16.3%)  254.5%  ( 228% - 282%)  0.000
```

```
full wikimedium5m run 3
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff  p-value
Fuzzy2                       78.85  (17.1%)    74.79  (15.3%)   -5.1%  ( -32% - 32%)  0.315
Fuzzy1                       73.72  (12.3%)    70.14   (9.6%)   -4.9%  ( -23% - 19%)  0.164
OrHighMed                   218.87   (3.8%)   213.12   (3.9%)   -2.6%  ( -9% - 5%)  0.031
OrHighNotHigh               710.58   (5.0%)   693.73   (4.9%)   -2.4%  ( -11% - 7%)  0.130
OrHighNotLow                766.45   (7.0%)   752.36   (5.4%)   -1.8%  ( -13% - 11%)  0.351
OrHighNotMed                788.49   (4.6%)   779.76   (4.0%)   -1.1%  ( -9% - 7%)  0.415
MedSpanNear                 432.51   (2.6%)   428.61   (2.9%)   -0.9%  ( -6% - 4%)  0.301
HighPhrase                  328.27   (2.6%)   325.47   (3.1%)   -0.9%  ( -6% - 4%)  0.338
MedTerm                    1537.24   (3.9%)  1525.49   (3.9%)   -0.8%  ( -8% - 7%)  0.537
PKLookup                    224.01   (3.4%)   222.35   (3.2%)   -0.7%  ( -7% - 6%)  0.478
HighTerm                   1852.48   (6.1%)  1839.68   (6.9%)   -0.7%  ( -12% - 13%)  0.737
OrNotHighLow                872.06   (4.3%)   866.35   (3.3%)   -0.7%  ( -7% - 7%)  0.589
OrNotHighHigh               696.91   (4.9%)   694.25   (5.3%)   -0.4%  ( -10% - 10%)  0.814
AndHighMed                  399.43   (3.7%)   398.38   (3.4%)   -0.3%  ( -7% - 7%)  0.818
BrowseMonthTaxoFacets        13.35   (2.5%)    13.33   (2.8%)   -0.1%  ( -5% - 5%)  0.891
BrowseMonthSSDVFacets        31.99   (2.2%)    31.97   (2.3%)   -0.1%  ( -4% - 4%)  0.917
HighIntervalsOrdered         56.92   (1.7%)    56.89   (1.5%)   -0.1%  ( -3% - 3%)  0.916
MedPhrase                   421.85   (2.6%)   421.64   (2.4%)   -0.1%  ( -4% - 5%)  0.949
LowSpanNear                 215.84   (1.5%)   215.81   (1.9%)   -0.0%  ( -3% - 3%)  0.975
BrowseDayOfYearTaxoFacets    11.13   (3.0%)    11.13   (3.2%)   -0.0%  ( -6% - 6%)  0.992
BrowseDayOfYearSSDVFacets    27.51   (8.3%)    27.52   (8.1%)    0.0%  ( -15% - 17%)  0.994
HighSpanNear                 16.99   (2.2%)    16.99   (2.1%)    0.0%  ( -4% - 4%)  0.968
BrowseDateTaxoFacets         11.11   (3.0%)    11.11   (3.3%)    0.0%  ( -6% - 6%)  0.977
Wildcard                    259.96   (2.3%)   260.11   (2.7%)    0.1%  ( -4% - 5%)  0.943
HighTermTitleBDVSort        216.56   (6.9%)   216.79   (7.9%)    0.1%  ( -13% - 15%)  0.964
LowSloppyPhrase              36.16   (3.5%)    36.20   (3.8%)    0.1%  ( -6% - 7%)  0.922
LowTerm                    1653.62   (6.1%)  1656.23   (4.8%)    0.2%  ( -10% - 11%)  0.928
TermDTSort                  236.21  (14.9%)   236.69  (14.7%)    0.2%  ( -25% - 34%)  0.965
OrNotHighMed                738.85   (3.6%)   741.27   (4.7%)    0.3%  ( -7% - 9%)  0.806
IntNRQ                      122.68   (1.1%)   123.17   (0.8%)    0.4%  ( -1% - 2%)  0.210
Respell                      75.86   (2.4%)    76.22   (2.0%)    0.5%  ( -3% - 5%)  0.505
HighSloppyPhrase             80.85   (3.7%)    81.25   (4.6%)    0.5%  ( -7% - 9%)  0.708
MedSloppyPhrase              31.20   (3.5%)    31.39   (4.3%)    0.6%  ( -6% - 8%)  0.628
HighTermMonthSort           396.29   (8.2%)   398.90   (9.3%)    0.7%  ( -15% - 19%)  0.812
Prefix3                     393.10   (2.7%)   396.20   (2.5%)    0.8%  ( -4% - 6%)  0.339
AndHighHigh                 105.61   (3.7%)   106.69   (4.0%)    1.0%  ( -6% - 9%)  0.399
LowPhrase                    61.52   (2.1%)    62.17   (3.2%)    1.1%  ( -4% - 6%)  0.221
AndHighLow                  915.63   (4.3%)   928.98   (3.1%)    1.5%  ( -5% - 9%)  0.217
HighTermDayOfYearSort       216.71  (14.0%)   220.00  (15.9%)    1.5%  ( -24% - 36%)  0.749
OrHighLow                   535.18   (7.4%)   571.87   (5.8%)    6.9%  ( -5% - 21%)  0.001
OrHighHigh                   51.30   (2.8%)    56.55   (2.7%)   10.2%  ( 4% - 16%)  0.000
```

So far the implementation seems to perform similarly to the baseline WANDScorer, with a surprising, occasional huge speedup for `OrHighLow` (+195.2% and +254.5% in full runs 1 and 2, but only +6.9% in run 3). Hopefully this is not caused by a bug :D. I think this performance characteristic makes sense: the low-frequency, high-score-contribution term would drive the iteration, and a big window size would let more docs be pruned quickly when their maxScores show they can't be competitive.
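To make the pruning argument above concrete, here is a minimal Java sketch of MaxScore-style clause partitioning within a single window. It is only an illustration under stated assumptions, not the code from this PR and not Lucene's API: the class `MaxScorePartitionSketch`, the method `countNonEssential`, and the sample scores are all hypothetical. The idea it shows is that clauses whose summed maximum scores stay below the minimum competitive score cannot produce a competitive hit on their own, so only the remaining (typically rarer, higher-scoring) clauses need to drive iteration.

```java
import java.util.Arrays;

public class MaxScorePartitionSketch {

    /**
     * Hypothetical helper: given each clause's maximum possible score inside
     * the current window, returns how many of the lowest-scoring clauses are
     * "non-essential", i.e. their max scores sum to less than the minimum
     * competitive score, so a doc matching only those clauses can be skipped.
     */
    public static int countNonEssential(float[] maxScoresInWindow, float minCompetitiveScore) {
        float[] sorted = maxScoresInWindow.clone();
        Arrays.sort(sorted); // ascending: weakest contributors first

        double sum = 0;
        int nonEssential = 0;
        for (float maxScore : sorted) {
            if (sum + maxScore < minCompetitiveScore) {
                sum += maxScore;
                nonEssential++;
            } else {
                break; // this clause and all stronger ones must drive iteration
            }
        }
        return nonEssential;
    }

    public static void main(String[] args) {
        // Made-up numbers: two frequent, low-scoring terms and one rare, high-scoring term.
        float[] maxScores = {0.4f, 0.6f, 5.0f};
        // With a minimum competitive score of 2.0, only the rare term is essential.
        System.out.println(countNonEssential(maxScores, 2.0f)); // prints 2
    }
}
```

In this toy setup the two frequent terms together can contribute at most about 1.0, so any doc that does not match the rare term is skipped. That matches the intuition for `OrHighLow`, where the low-frequency clause has far fewer postings to visit.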