jpountz opened a new pull request, #13636:
URL: https://github.com/apache/lucene/pull/13636
Our postings use a layout that helps take advantage of Java's
auto-vectorization to be reasonably fast to decode. But we can make it a bit
faster by using explicit vectorization on MemorySegment:
- vectorizing directly from the MemorySegment instead of first copying data
into a long[],
- decoding more longs than requested instead of forcing the last longs to
be handled via scalar instructions.
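The second point can be illustrated with a scalar stand-in. This is a minimal sketch, not the code from this pull request: `LANES`, `decodeWithTail` and `decodeOver` are hypothetical names, the inner loop stands in for a single vector operation, and a shift-and-mask stands in for the real bit-unpacking. It assumes both arrays are padded to a multiple of `LANES`, which is what makes decoding past the requested count safe.

```java
import java.util.Arrays;

// Illustrative sketch only: none of these names are Lucene APIs.
final class OverDecodeSketch {
  // Hypothetical lane count; a real vector species would supply this value.
  static final int LANES = 4;

  /** Rounds count up to the next multiple of LANES. */
  static int roundUpToLanes(int count) {
    return (count + LANES - 1) / LANES * LANES;
  }

  /** Whole chunks first, then a scalar tail for the remaining values. */
  static void decodeWithTail(long[] in, int count, int shift, long mask, long[] out) {
    int bound = count / LANES * LANES;
    for (int i = 0; i < bound; i += LANES) {
      for (int lane = 0; lane < LANES; lane++) { // stands in for one vector op
        out[i + lane] = (in[i + lane] >>> shift) & mask;
      }
    }
    for (int i = bound; i < count; i++) { // scalar tail
      out[i] = (in[i] >>> shift) & mask;
    }
  }

  /** Whole chunks only: decodes up to LANES - 1 values beyond count. */
  static void decodeOver(long[] in, int count, int shift, long mask, long[] out) {
    int bound = roundUpToLanes(count); // requires in and out padded to this length
    for (int i = 0; i < bound; i += LANES) {
      for (int lane = 0; lane < LANES; lane++) { // stands in for one vector op
        out[i + lane] = (in[i + lane] >>> shift) & mask;
      }
    }
  }

  public static void main(String[] args) {
    long[] in = {0x1F, 0x2F, 0x3F, 0x4F, 0x5F, 0x6F, 0x7F, 0x8F};
    long[] withTail = new long[in.length];
    long[] over = new long[in.length];
    decodeWithTail(in, 6, 4, 0xFL, withTail);
    decodeOver(in, 6, 4, 0xFL, over);
    System.out.println(Arrays.toString(withTail));
    System.out.println(Arrays.toString(over));
  }
}
```

Both methods agree on the first `count` values; the second one simply writes a few extra values that the caller ignores, in exchange for a loop with no scalar remainder.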
This explicit vectorization only applies when the `Directory` uses
`MemorySegmentIndexInput` under the hood, i.e. `MMapDirectory` on JDK 21+. The
`ForUtilBenchmark` micro benchmark reports the following results:
```
Before
Benchmark                            (bpv)   Mode  Cnt   Score   Error   Units
ForUtilBenchmark.decode                  5  thrpt   15  36.244 ± 0.742  ops/us
ForUtilBenchmark.decode                  6  thrpt   15  35.406 ± 0.170  ops/us
ForUtilBenchmark.decode                  7  thrpt   15  33.396 ± 0.291  ops/us
ForUtilBenchmark.decode                  8  thrpt   15  41.064 ± 2.269  ops/us
ForUtilBenchmark.decode                  9  thrpt   15  30.288 ± 0.172  ops/us
ForUtilBenchmark.decode                 10  thrpt   15  31.203 ± 0.791  ops/us
ForUtilBenchmark.decodeAndPrefixSum      5  thrpt   15  19.421 ± 0.272  ops/us
ForUtilBenchmark.decodeAndPrefixSum      6  thrpt   15  18.932 ± 0.356  ops/us
ForUtilBenchmark.decodeAndPrefixSum      7  thrpt   15  16.824 ± 1.080  ops/us
ForUtilBenchmark.decodeAndPrefixSum      8  thrpt   15  21.085 ± 0.316  ops/us
ForUtilBenchmark.decodeAndPrefixSum      9  thrpt   15  15.874 ± 2.085  ops/us
ForUtilBenchmark.decodeAndPrefixSum     10  thrpt   15  17.827 ± 0.210  ops/us

After
Benchmark                            (bpv)   Mode  Cnt   Score   Error   Units
ForUtilBenchmark.decode                  5  thrpt   15  40.774 ± 1.170  ops/us
ForUtilBenchmark.decode                  6  thrpt   15  44.392 ± 0.748  ops/us
ForUtilBenchmark.decode                  7  thrpt   15  43.050 ± 0.586  ops/us
ForUtilBenchmark.decode                  8  thrpt   15  49.773 ± 0.376  ops/us
ForUtilBenchmark.decode                  9  thrpt   15  36.264 ± 0.434  ops/us
ForUtilBenchmark.decode                 10  thrpt   15  38.403 ± 1.388  ops/us
ForUtilBenchmark.decodeAndPrefixSum      5  thrpt   15  19.362 ± 0.573  ops/us
ForUtilBenchmark.decodeAndPrefixSum      6  thrpt   15  18.402 ± 3.128  ops/us
ForUtilBenchmark.decodeAndPrefixSum      7  thrpt   15  19.518 ± 0.680  ops/us
ForUtilBenchmark.decodeAndPrefixSum      8  thrpt   15  21.388 ± 0.228  ops/us
ForUtilBenchmark.decodeAndPrefixSum      9  thrpt   15  18.126 ± 0.625  ops/us
ForUtilBenchmark.decodeAndPrefixSum     10  thrpt   15  19.161 ± 0.379  ops/us
```
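One way to read these numbers is to compute the relative change per row and check whether the before and after intervals overlap. The sketch below is only a reading aid: `BenchmarkDelta`, `pctChange` and `intervalsOverlap` are hypothetical helper names, and treating the `Error` column as a symmetric interval around `Score` is an assumption about how to interpret the output. The input values are copied from the tables above.

```java
// Illustrative helper for reading the before/after tables; not part of Lucene or JMH.
final class BenchmarkDelta {
  /** Relative change from before to after, in percent. */
  static double pctChange(double before, double after) {
    return (after - before) / before * 100.0;
  }

  /** True if [aScore - aErr, aScore + aErr] and [bScore - bErr, bScore + bErr] intersect. */
  static boolean intervalsOverlap(double aScore, double aErr, double bScore, double bErr) {
    return aScore - aErr <= bScore + bErr && bScore - bErr <= aScore + aErr;
  }

  public static void main(String[] args) {
    // decode, bpv=7: 33.396 ± 0.291 before, 43.050 ± 0.586 after
    System.out.printf("decode bpv=7: %+.1f%%, overlap=%b%n",
        pctChange(33.396, 43.050),
        intervalsOverlap(33.396, 0.291, 43.050, 0.586));
    // decodeAndPrefixSum, bpv=5: 19.421 ± 0.272 before, 19.362 ± 0.573 after
    System.out.printf("decodeAndPrefixSum bpv=5: %+.1f%%, overlap=%b%n",
        pctChange(19.421, 19.362),
        intervalsOverlap(19.421, 0.272, 19.362, 0.573));
  }
}
```

Read this way, every `decode` row improves by more than its reported error, while some `decodeAndPrefixSum` rows (for example bpv 5 and 8) differ by less than their error.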
And `luceneutil` on `wikibigall` reports the following (only look at queries
with low p-values):
```
Task                  QPS baseline   StdDev  QPS my_modified_version   StdDev                Pct diff p-value
IntNRQ                      132.60  (23.4%)                   129.72  (24.1%)   -2.2% ( -40% -   59%)   0.772
CountTerm                  9419.14   (3.5%)                  9377.10   (3.1%)   -0.4% (  -6% -    6%)   0.669
Respell                      55.22   (1.2%)                    55.24   (1.5%)    0.0% (  -2% -    2%)   0.931
Fuzzy2                       71.83   (1.1%)                    71.87   (1.1%)    0.1% (  -2% -    2%)   0.871
TermDTSort                  360.85   (6.8%)                   361.40   (5.6%)    0.2% ( -11% -   13%)   0.939
HighTermMonthSort          3491.38   (1.4%)                  3499.56   (2.4%)    0.2% (  -3% -    4%)   0.702
Fuzzy1                       89.72   (1.1%)                    90.06   (1.2%)    0.4% (  -1% -    2%)   0.285
MedSloppyPhrase               7.70   (3.6%)                     7.73   (6.6%)    0.4% (  -9% -   10%)   0.791
PKLookup                    288.99   (1.7%)                   290.39   (1.5%)    0.5% (  -2% -    3%)   0.332
HighTermDayOfYearSort       829.90   (3.4%)                   836.92   (3.4%)    0.8% (  -5% -    7%)   0.436
HighPhrase                   66.40   (2.8%)                    66.97   (5.1%)    0.9% (  -6% -    9%)   0.508
Prefix3                     185.20   (3.1%)                   186.81   (2.8%)    0.9% (  -4% -    7%)   0.352
HighTermTitleSort           162.00   (4.1%)                   163.43   (5.0%)    0.9% (  -7% -   10%)   0.543
HighSloppyPhrase              6.91   (3.0%)                     6.98   (6.6%)    1.0% (  -8% -   10%)   0.521
Wildcard                     53.89   (3.9%)                    54.64   (3.5%)    1.4% (  -5% -    9%)   0.237
OrHighRare                  252.56   (9.3%)                   257.10   (9.6%)    1.8% ( -15% -   22%)   0.548
MedIntervalsOrdered           3.33   (4.7%)                     3.40   (5.6%)    1.9% (  -8% -   12%)   0.259
HighSpanNear                  4.60   (1.5%)                     4.69   (1.4%)    1.9% (  -1% -    4%)   0.000
OrNotHighHigh               221.99   (3.7%)                   226.47   (5.4%)    2.0% (  -6% -   11%)   0.164
OrHighLow                   812.48   (2.0%)                   828.98   (2.3%)    2.0% (  -2% -    6%)   0.003
HighIntervalsOrdered          1.36   (5.3%)                     1.39   (6.4%)    2.0% (  -9% -   14%)   0.272
LowIntervalsOrdered           4.38   (4.1%)                     4.47   (4.5%)    2.1% (  -6% -   11%)   0.130
OrHighNotHigh               207.32   (4.1%)                   211.83   (5.6%)    2.2% (  -7% -   12%)   0.159
OrNotHighMed                334.67   (3.7%)                   342.02   (5.8%)    2.2% (  -7% -   12%)   0.152
MedSpanNear                  14.99   (1.8%)                    15.32   (1.4%)    2.2% (  -1% -    5%)   0.000
MedPhrase                    14.61   (3.2%)                    14.97   (4.6%)    2.5% (  -5% -   10%)   0.052
LowPhrase                    71.73   (3.3%)                    73.49   (4.4%)    2.5% (  -5% -   10%)   0.046
AndHighHigh                  70.54   (2.0%)                    72.44   (1.3%)    2.7% (   0% -    6%)   0.000
OrHighHigh                   67.00   (1.7%)                    68.80   (1.6%)    2.7% (   0% -    6%)   0.000
OrHighMed                   191.06   (1.9%)                   196.57   (1.8%)    2.9% (   0% -    6%)   0.000
LowSloppyPhrase              24.50   (2.9%)                    25.27   (4.1%)    3.2% (  -3% -   10%)   0.005
AndHighMed                  151.19   (2.2%)                   156.00   (1.4%)    3.2% (   0% -    6%)   0.000
CountPhrase                   3.18   (9.0%)                     3.29   (9.1%)    3.3% ( -13% -   23%)   0.244
Or2Terms2StopWords          160.27   (4.4%)                   165.68   (1.5%)    3.4% (  -2% -    9%)   0.001
And2Terms2StopWords         157.18   (2.8%)                   162.48   (1.4%)    3.4% (   0% -    7%)   0.000
OrHighNotMed                301.42   (4.1%)                   311.77   (6.1%)    3.4% (  -6% -   14%)   0.038
LowSpanNear                   9.83   (1.5%)                    10.17   (1.4%)    3.5% (   0% -    6%)   0.000
CountAndHighHigh             46.76   (1.3%)                    48.48   (2.3%)    3.7% (   0% -    7%)   0.000
HighTermTitleBDVSort         11.40   (5.6%)                    11.82   (8.1%)    3.7% (  -9% -   18%)   0.092
OrHighNotLow                342.09   (4.6%)                   355.83   (7.0%)    4.0% (  -7% -   16%)   0.033
And3Terms                   165.26   (3.0%)                   171.96   (1.6%)    4.1% (   0% -    8%)   0.000
Or3Terms                    165.25   (4.5%)                   171.96   (1.5%)    4.1% (  -1% -   10%)   0.000
AndHighLow                  993.19   (2.8%)                  1034.27   (2.8%)    4.1% (  -1% -   10%)   0.000
OrNotHighLow                989.09   (3.1%)                  1030.58   (3.5%)    4.2% (  -2% -   11%)   0.000
AndStopWords                 29.62   (4.1%)                    30.97   (1.9%)    4.6% (  -1% -   11%)   0.000
OrStopWords                  32.89   (7.0%)                    34.41   (2.6%)    4.6% (  -4% -   15%)   0.006
LowTerm                     978.60   (3.0%)                  1023.81   (6.1%)    4.6% (  -4% -   14%)   0.002
CountAndHighMed             140.50   (1.6%)                   147.18   (2.6%)    4.8% (   0% -    9%)   0.000
CountOrHighHigh              57.94  (15.1%)                    60.85  (16.7%)    5.0% ( -23% -   43%)   0.319
CountOrHighMed              113.79  (11.2%)                   120.21  (13.2%)    5.6% ( -16% -   33%)   0.145
HighTerm                    363.62   (5.0%)                   384.13   (8.2%)    5.6% (  -7% -   19%)   0.009
MedTerm                     546.01   (4.2%)                   580.61   (7.9%)    6.3% (  -5% -   19%)   0.002
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]