original-brownbear commented on PR #13472:
URL: https://github.com/apache/lucene/pull/13472#issuecomment-2176505152
I reran the benchmarks of no concurrency vs 4 threads and constrained the
page cache a lot by setting -Xmx to almost all of the machine's memory (the
page cache shrinks to about 1.5G). As somewhat expected, concurrency helps a
lot in this scenario.
```
Task                        QPS baseline   StdDev  QPS my_modified_version    StdDev  Pct diff                  p-value
PKLookup                           85.93  (34.3%)                     3.99    (1.4%)    -95.4%  ( -97% - -90%)  0.000
Fuzzy1                            101.17   (2.9%)                     8.18    (8.3%)    -91.9%  (-100% - -83%)  0.000
AndHighLow                       1044.32   (2.5%)                    85.54    (0.2%)    -91.8%  ( -92% - -91%)  0.000
Fuzzy2                             47.18  (16.5%)                     4.91    (0.9%)    -89.6%  ( -91% - -86%)  0.000
LowSloppyPhrase                    45.24   (3.3%)                     4.85    (0.3%)    -89.3%  ( -89% - -88%)  0.000
MedPhrase                          40.91   (1.6%)                     5.12    (2.2%)    -87.5%  ( -89% - -85%)  0.000
HighPhrase                         87.72   (1.4%)                    11.00    (0.1%)    -87.5%  ( -87% - -87%)  0.000
HighTermTitleBDVSort               13.88   (1.6%)                     1.78    (0.1%)    -87.2%  ( -87% - -86%)  0.000
OrNotHighLow                     1688.98   (6.8%)                   221.97   (15.1%)    -86.9%  (-101% - -69%)  0.000
Prefix3                           557.04  (17.4%)                    73.35    (2.6%)    -86.8%  ( -90% - -80%)  0.000
Wildcard                           30.67   (7.3%)                     4.55    (3.8%)    -85.2%  ( -89% - -79%)  0.000
Respell                            32.29  (40.9%)                     4.90   (14.7%)    -84.8%  ( -99% - -49%)  0.000
LowSpanNear                        25.42   (0.8%)                     3.98    (2.7%)    -84.4%  ( -87% - -81%)  0.000
MedSloppyPhrase                    57.73   (2.6%)                     9.15    (1.3%)    -84.2%  ( -85% - -82%)  0.000
LowPhrase                         169.09   (0.7%)                    28.85   (13.2%)    -82.9%  ( -96% - -69%)  0.000
LowIntervalsOrdered                14.27   (0.8%)                     2.44    (3.6%)    -82.9%  ( -86% - -79%)  0.000
IntNRQ                             97.33   (2.6%)                    18.62    (0.9%)    -80.9%  ( -82% - -79%)  0.000
MedSpanNear                        27.71   (0.8%)                     5.96    (4.1%)    -78.5%  ( -82% - -74%)  0.000
MedIntervalsOrdered                 5.63   (0.9%)                     1.49    (5.1%)    -73.6%  ( -78% - -68%)  0.000
HighIntervalsOrdered                4.11   (1.0%)                     1.09    (0.9%)    -73.6%  ( -74% - -72%)  0.000
HighSloppyPhrase                   16.75   (1.8%)                     4.68    (2.3%)    -72.1%  ( -74% - -69%)  0.000
HighSpanNear                        7.44   (1.3%)                     2.13    (2.8%)    -71.3%  ( -74% - -68%)  0.000
OrNotHighMed                      278.73   (2.2%)                    85.43    (2.8%)    -69.4%  ( -72% - -65%)  0.000
BrowseDateTaxoFacets               15.30   (2.3%)                     4.80    (1.7%)    -68.6%  ( -71% - -66%)  0.000
BrowseDayOfYearTaxoFacets          14.91  (11.4%)                     5.52    (2.4%)    -62.9%  ( -68% - -55%)  0.000
BrowseRandomLabelTaxoFacets        11.38   (0.5%)                     4.28    (1.5%)    -62.4%  ( -64% - -60%)  0.000
OrHighMed                         129.46   (4.2%)                    48.97    (8.7%)    -62.2%  ( -72% - -51%)  0.000
AndHighHigh                        65.35   (1.0%)                    25.98    (6.2%)    -60.2%  ( -66% - -53%)  0.000
AndHighMed                         76.79   (1.3%)                    31.63    (9.5%)    -58.8%  ( -68% - -48%)  0.000
OrHighHigh                         42.48  (14.7%)                    18.32    (1.5%)    -56.9%  ( -63% - -47%)  0.000
OrHighNotLow                      144.92   (1.7%)                    64.99    (0.8%)    -55.2%  ( -56% - -53%)  0.000
MedTerm                           284.66  (25.9%)                   152.10    (3.4%)    -46.6%  ( -60% - -23%)  0.000
AndHighMedDayTaxoFacets           108.26   (2.1%)                    63.19    (2.1%)    -41.6%  ( -44% - -38%)  0.000
BrowseDayOfYearSSDVFacets           7.91   (5.5%)                     4.67    (1.5%)    -40.9%  ( -45% - -35%)  0.000
OrNotHighHigh                     155.25   (9.7%)                    93.97    (2.8%)    -39.5%  ( -47% - -29%)  0.000
OrHighLow                         176.36  (55.0%)                   123.29   (59.2%)    -30.1%  ( -93% - 186%)  0.405
BrowseDateSSDVFacets                1.53   (2.5%)                     1.27    (8.8%)    -17.3%  ( -27% - -6%)   0.000
MedTermDayTaxoFacets               21.73   (1.6%)                    18.02    (2.8%)    -17.0%  ( -21% - -12%)  0.000
LowTerm                           279.50  (24.2%)                   234.75   (14.9%)    -16.0%  ( -44% - 30%)   0.208
OrHighNotHigh                      70.74  (25.6%)                    60.08   (54.8%)    -15.1%  ( -76% - 87%)   0.578
HighTermTitleSort                  23.23   (1.5%)                    19.88   (16.3%)    -14.4%  ( -31% - 3%)    0.048
AndHighHighDayTaxoFacets            7.27   (1.3%)                     6.80    (4.5%)     -6.5%  ( -12% - 0%)    0.002
BrowseMonthTaxoFacets              12.85  (24.3%)                    12.34   (28.6%)     -4.0%  ( -45% - 64%)   0.813
BrowseMonthSSDVFacets               5.17   (1.7%)                     4.99    (2.2%)     -3.4%  ( -7% - 0%)     0.007
BrowseRandomLabelSSDVFacets         3.72   (0.5%)                     3.81   (14.4%)      2.5%  ( -12% - 17%)   0.696
OrHighMedDayTaxoFacets              4.74   (1.7%)                     4.89    (4.6%)      3.3%  ( -2% - 9%)     0.134
HighTermMonthSort                 279.05  (33.6%)                   307.51  (140.4%)     10.2%  (-122% - 277%)  0.874
OrHighNotMed                      120.83   (1.2%)                   154.43   (84.5%)     27.8%  ( -57% - 114%)  0.462
TermDTSort                         59.06   (4.9%)                    81.90   (63.8%)     38.7%  ( -28% - 113%)  0.177
HighTermDayOfYearSort              55.42  (41.8%)                    77.40   (19.2%)     39.7%  ( -15% - 173%)  0.054
HighTerm                          124.75   (7.0%)                   496.22   (33.9%)    297.8%  ( 239% - 364%)  0.000
```
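For anyone reading the table, the "Pct diff" column appears to be the relative change of the my_modified_version QPS against the baseline QPS. The sketch below recomputes it for three rows copied from the table; the helper name `pct_diff` is made up for illustration and is not part of the benchmark tooling, and the formula is an assumption that happens to reproduce the reported values to one decimal place.

```python
def pct_diff(baseline_qps: float, candidate_qps: float) -> float:
    """Relative change of the candidate QPS versus the baseline QPS, in percent.

    Assumed formula (not taken from the benchmark tool's source).
    """
    return (candidate_qps - baseline_qps) / baseline_qps * 100.0


# (baseline QPS, my_modified_version QPS) copied from the table above.
rows = {
    "PKLookup": (85.93, 3.99),
    "OrHighMed": (129.46, 48.97),
    "HighTerm": (124.75, 496.22),
}

for task, (baseline, candidate) in rows.items():
    print(f"{task}: {pct_diff(baseline, candidate):+.1f}%")
```

A negative value means the my_modified_version column is slower than the baseline column for that task, a positive value means it is faster.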
It seems to me that CPU-bound scenarios don't parallelise well, and cache
misses just introduce bottlenecks that more than outweigh the benefits of
speeding up the parallel sections of a search. But as soon as significant disk
IO comes into play, the situation reverses, as one would expect. Who cares
about stalling on RAM reads while frequently stalling on disk, I guess? :)
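The trade-off described here can be illustrated with a toy Amdahl-style cost model. This is only a sketch with made-up numbers, not a measurement and not how the benchmark works: `serial`, `parallel` and `overhead_per_thread` are hypothetical time units, and the overhead term is an assumed stand-in for coordination cost and extra cache misses.

```python
def search_time(serial: float, parallel: float, threads: int,
                overhead_per_thread: float) -> float:
    """Toy model: serial part + evenly split parallel part + per-extra-thread overhead."""
    return serial + parallel / threads + overhead_per_thread * (threads - 1)


def speedup(serial: float, parallel: float, threads: int,
            overhead_per_thread: float) -> float:
    """Single-threaded time divided by multi-threaded time (> 1 means faster)."""
    single = search_time(serial, parallel, 1, overhead_per_thread)
    multi = search_time(serial, parallel, threads, overhead_per_thread)
    return single / multi


# Hypothetical CPU-bound query: little parallelisable work, overhead dominates.
cpu_bound = speedup(serial=1.0, parallel=1.0, threads=4, overhead_per_thread=0.3)

# Hypothetical disk-bound query: lots of parallelisable waiting on IO.
io_bound = speedup(serial=1.0, parallel=20.0, threads=4, overhead_per_thread=0.3)

print(f"CPU-bound speedup with 4 threads: {cpu_bound:.2f}x")
print(f"IO-bound speedup with 4 threads:  {io_bound:.2f}x")
```

With these assumed inputs the CPU-bound case comes out slightly slower with 4 threads while the IO-bound case comes out several times faster, which is the shape of the effect described above. The real magnitudes depend on the hardware and the queries.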
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]