zacharymorn commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1163861983
Thanks @jpountz for looking into this! I ran further experiments, and the results suggest the difference may be caused by a bug or caching in the benchmark util or in Lucene itself.

First, I kept only 1 query per pure disjunction task and removed the rest of the tasks, like below:

```
OrHighHigh: several following # freq=436129 freq=416515
OrHighMed: international chris # freq=418261 freq=85523
OrHighLow: 2005 valois # freq=835460 freq=2277
```

and got this result:

```
Task                         QPS baseline    StdDev  QPS my_modified_version    StdDev  Pct diff                   p-value
PKLookup                           189.27   (23.2%)                   220.18   (16.1%)     16.3%  ( -18% -   72%)  0.010
OrHighHigh                          10.37   (41.8%)                    20.92   (94.9%)    101.8%  ( -24% -  410%)  0.000
OrHighMed                           21.43   (54.3%)                    56.18  (138.5%)    162.2%  ( -19% -  777%)  0.000
OrHighLow                          138.14   (26.2%)                   368.50   (91.9%)    166.7%  (  38% -  385%)  0.000
```

However, when I added back the rest of the tasks but still kept 1 query for each of the three disjunction tasks, I got vastly different results:

```
Task                         QPS baseline    StdDev  QPS my_modified_version    StdDev  Pct diff                   p-value
OrHighHigh                          48.31    (7.8%)                    38.21    (5.6%)    -20.9%  ( -31% -   -8%)  0.000
OrHighMed                          152.23    (8.5%)                   140.16    (8.5%)     -7.9%  ( -23% -    9%)  0.003
BrowseDayOfYearSSDVFacets           21.67   (12.4%)                    20.84    (7.1%)     -3.8%  ( -20% -   17%)  0.231
MedPhrase                          103.08    (5.3%)                   102.43    (9.1%)     -0.6%  ( -14% -   14%)  0.790
TermDTSort                         162.97   (10.6%)                   162.10    (6.4%)     -0.5%  ( -15% -   18%)  0.847
HighSloppyPhrase                    62.12    (7.3%)                    61.99    (5.7%)     -0.2%  ( -12% -   13%)  0.921
OrHighNotMed                      1216.05    (4.1%)                  1216.03    (3.2%)     -0.0%  (  -7% -    7%)  0.999
HighTerm                          2088.45    (4.2%)                  2091.84    (3.5%)      0.2%  (  -7% -    8%)  0.895
BrowseMonthSSDVFacets               23.34   (10.0%)                    23.49   (11.0%)      0.6%  ( -18% -   24%)  0.845
MedTerm                           3189.76    (3.5%)                  3215.19    (3.9%)      0.8%  (  -6% -    8%)  0.497
AndHighHigh                         59.64    (6.9%)                    60.14    (6.3%)      0.8%  ( -11% -   15%)  0.688
MedSpanNear                         46.86    (5.9%)                    47.26    (7.8%)      0.9%  ( -12% -   15%)  0.692
HighTermTitleBDVSort               125.65    (7.5%)                   126.83   (10.1%)      0.9%  ( -15% -   20%)  0.738
LowPhrase                           21.25    (4.2%)                    21.62    (4.4%)      1.8%  (  -6% -   10%)  0.194
BrowseRandomLabelSSDVFacets         15.38    (6.6%)                    15.66   (10.1%)      1.8%  ( -13% -   19%)  0.509
HighSpanNear                        20.48    (5.6%)                    20.86    (5.1%)      1.9%  (  -8% -   13%)  0.270
HighTermDayOfYearSort              187.51    (7.6%)                   191.07    (9.4%)      1.9%  ( -14% -   20%)  0.482
OrHighNotLow                      1505.18   (10.5%)                  1535.43    (4.4%)      2.0%  ( -11% -   18%)  0.431
LowSloppyPhrase                    233.02    (6.9%)                   237.95    (8.4%)      2.1%  ( -12% -   18%)  0.383
MedIntervalsOrdered                 18.37    (5.0%)                    18.77    (5.3%)      2.2%  (  -7% -   13%)  0.177
OrNotHighMed                      1310.81    (4.0%)                  1342.33    (5.1%)      2.4%  (  -6% -   11%)  0.096
MedSloppyPhrase                     40.20    (6.2%)                    41.18    (5.7%)      2.4%  (  -8% -   15%)  0.190
AndHighHighDayTaxoFacets            13.90    (5.6%)                    14.25    (3.7%)      2.6%  (  -6% -   12%)  0.090
AndHighMed                         566.84    (5.8%)                   582.07    (5.7%)      2.7%  (  -8% -   15%)  0.138
OrNotHighHigh                     1976.63    (5.0%)                  2030.44    (3.5%)      2.7%  (  -5% -   11%)  0.044
Fuzzy2                              50.72   (10.3%)                    52.17    (7.9%)      2.9%  ( -13% -   23%)  0.325
MedTermDayTaxoFacets                79.53    (6.7%)                    81.86    (7.5%)      2.9%  ( -10% -   18%)  0.192
OrHighNotHigh                     1169.61    (6.6%)                  1204.25    (3.8%)      3.0%  (  -6% -   14%)  0.080
HighPhrase                         400.95    (5.5%)                   413.11    (2.2%)      3.0%  (  -4% -   11%)  0.022
OrHighMedDayTaxoFacets              13.10    (5.0%)                    13.50    (5.6%)      3.1%  (  -7% -   14%)  0.066
AndHighMedDayTaxoFacets             52.75    (6.4%)                    54.42    (6.3%)      3.2%  (  -8% -   16%)  0.115
OrNotHighLow                      2842.51    (3.5%)                  2935.45    (5.5%)      3.3%  (  -5% -   12%)  0.025
LowTerm                           3032.44    (3.7%)                  3140.83    (3.0%)      3.6%  (  -3% -   10%)  0.001
HighTermMonthSort                  210.58   (11.4%)                   218.63   (11.3%)      3.8%  ( -16% -   29%)  0.286
Fuzzy1                             135.95    (8.1%)                   141.35    (9.6%)      4.0%  ( -12% -   23%)  0.158
IntNRQ                             365.48   (17.1%)                   380.05   (12.1%)      4.0%  ( -21% -   40%)  0.395
Prefix3                             72.71   (11.4%)                    75.69    (9.5%)      4.1%  ( -15% -   28%)  0.217
Wildcard                           333.86    (9.4%)                   347.59   (10.5%)      4.1%  ( -14% -   26%)  0.192
LowSpanNear                         40.56    (4.1%)                    42.49    (6.4%)      4.8%  (  -5% -   15%)  0.005
Respell                             56.68    (7.6%)                    59.68    (5.5%)      5.3%  (  -7% -   19%)  0.012
AndHighLow                        1062.10    (6.2%)                  1118.34    (5.9%)      5.3%  (  -6% -   18%)  0.005
HighIntervalsOrdered                 9.10    (4.5%)                     9.64    (6.5%)      5.9%  (  -4% -   17%)  0.001
LowIntervalsOrdered                 15.02    (5.1%)                    15.96    (5.2%)      6.2%  (  -3% -   17%)  0.000
BrowseDateSSDVFacets                 2.29    (6.5%)                     2.44   (19.1%)      6.4%  ( -18% -   34%)  0.158
OrHighLow                          931.34    (4.9%)                   995.65    (3.7%)      6.9%  (  -1% -   16%)  0.000
PKLookup                           244.93   (10.0%)                   262.86   (11.4%)      7.3%  ( -12% -   31%)  0.031
BrowseMonthTaxoFacets               23.30   (42.2%)                    28.97   (50.4%)     24.3%  ( -47% -  202%)  0.098
BrowseDateTaxoFacets                23.39   (44.9%)                    29.54   (52.5%)     26.3%  ( -49% -  224%)  0.089
BrowseDayOfYearTaxoFacets           23.38   (44.2%)                    30.09   (54.9%)     28.7%  ( -48% -  228%)  0.069
BrowseRandomLabelTaxoFacets         22.47   (55.6%)                    29.49   (68.4%)     31.2%  ( -59% -  349%)  0.113
```

I've attached the tasks file for reference here as well: [wikimedium.10M.nostopwords.tasks.txt](https://github.com/apache/lucene/files/8962486/wikimedium.10M.nostopwords.tasks.txt)

@mikemccand, do you have any suggestions on where this discrepancy might be coming from? I'll continue to run experiments as well to see if I can pinpoint the issue.
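
One way to quantify the discrepancy is to compare the baseline numbers for the same task across the two runs. Below is a minimal sketch with the values copied from the two result tables in this comment; the `baseline_ratio` helper and the dictionary names are hypothetical, for illustration only, and are not part of luceneutil:

```python
# (baseline QPS, baseline StdDev in %) for the three disjunction tasks,
# copied from the two result tables in this comment.
ISOLATED = {  # run with only the disjunction tasks kept
    "OrHighHigh": (10.37, 41.8),
    "OrHighMed": (21.43, 54.3),
    "OrHighLow": (138.14, 26.2),
}
FULL = {  # run with the rest of the tasks added back
    "OrHighHigh": (48.31, 7.8),
    "OrHighMed": (152.23, 8.5),
    "OrHighLow": (931.34, 4.9),
}


def baseline_ratio(task):
    """Hypothetical helper: full-run baseline QPS divided by isolated-run baseline QPS."""
    return FULL[task][0] / ISOLATED[task][0]


for task in ISOLATED:
    print(
        f"{task}: baseline QPS x{baseline_ratio(task):.1f}, "
        f"baseline StdDev {ISOLATED[task][1]}% -> {FULL[task][1]}%"
    )
```

The baseline side of each run measures the same task, yet its QPS differs several-fold between the two setups and its StdDev is much larger in the isolated run, which suggests the two runs may not be directly comparable.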