jpountz opened a new pull request, #13636: URL: https://github.com/apache/lucene/pull/13636
Our postings use a layout that helps take advantage of Java's auto-vectorization to be reasonably fast to decode. But we can make it a bit faster by using explicit vectorization on MemorySegment:
 - vectorizing directly from the MemorySegment instead of first copying data into a long[],
 - decoding more longs than requested instead of forcing the last longs to be handled via scalar instructions.

This approach only works when the `Directory` uses `MemorySegmentIndexInput` under the hood, i.e. `MMapDirectory` on JDK 21+.

The `ForUtilBenchmark` micro benchmark reports the following results:

```
Before
Benchmark                            (bpv)   Mode  Cnt   Score   Error   Units
ForUtilBenchmark.decode                  5  thrpt   15  36.244 ± 0.742  ops/us
ForUtilBenchmark.decode                  6  thrpt   15  35.406 ± 0.170  ops/us
ForUtilBenchmark.decode                  7  thrpt   15  33.396 ± 0.291  ops/us
ForUtilBenchmark.decode                  8  thrpt   15  41.064 ± 2.269  ops/us
ForUtilBenchmark.decode                  9  thrpt   15  30.288 ± 0.172  ops/us
ForUtilBenchmark.decode                 10  thrpt   15  31.203 ± 0.791  ops/us
ForUtilBenchmark.decodeAndPrefixSum      5  thrpt   15  19.421 ± 0.272  ops/us
ForUtilBenchmark.decodeAndPrefixSum      6  thrpt   15  18.932 ± 0.356  ops/us
ForUtilBenchmark.decodeAndPrefixSum      7  thrpt   15  16.824 ± 1.080  ops/us
ForUtilBenchmark.decodeAndPrefixSum      8  thrpt   15  21.085 ± 0.316  ops/us
ForUtilBenchmark.decodeAndPrefixSum      9  thrpt   15  15.874 ± 2.085  ops/us
ForUtilBenchmark.decodeAndPrefixSum     10  thrpt   15  17.827 ± 0.210  ops/us

After
Benchmark                            (bpv)   Mode  Cnt   Score   Error   Units
ForUtilBenchmark.decode                  5  thrpt   15  40.774 ± 1.170  ops/us
ForUtilBenchmark.decode                  6  thrpt   15  44.392 ± 0.748  ops/us
ForUtilBenchmark.decode                  7  thrpt   15  43.050 ± 0.586  ops/us
ForUtilBenchmark.decode                  8  thrpt   15  49.773 ± 0.376  ops/us
ForUtilBenchmark.decode                  9  thrpt   15  36.264 ± 0.434  ops/us
ForUtilBenchmark.decode                 10  thrpt   15  38.403 ± 1.388  ops/us
ForUtilBenchmark.decodeAndPrefixSum      5  thrpt   15  19.362 ± 0.573  ops/us
ForUtilBenchmark.decodeAndPrefixSum      6  thrpt   15  18.402 ± 3.128  ops/us
ForUtilBenchmark.decodeAndPrefixSum      7  thrpt   15  19.518 ± 0.680  ops/us
ForUtilBenchmark.decodeAndPrefixSum      8  thrpt   15  21.388 ± 0.228  ops/us
ForUtilBenchmark.decodeAndPrefixSum      9  thrpt   15  18.126 ± 0.625  ops/us
ForUtilBenchmark.decodeAndPrefixSum     10  thrpt   15  19.161 ± 0.379  ops/us
```

And `luceneutil` on `wikibigall` reports the following (only look at queries with low p-values):

```
Task                  QPS baseline   StdDev  QPS my_modified_version   StdDev              Pct diff  p-value
IntNRQ                      132.60  (23.4%)                   129.72  (24.1%)  -2.2% ( -40% -  59%)    0.772
CountTerm                  9419.14   (3.5%)                  9377.10   (3.1%)  -0.4% (  -6% -   6%)    0.669
Respell                      55.22   (1.2%)                    55.24   (1.5%)   0.0% (  -2% -   2%)    0.931
Fuzzy2                       71.83   (1.1%)                    71.87   (1.1%)   0.1% (  -2% -   2%)    0.871
TermDTSort                  360.85   (6.8%)                   361.40   (5.6%)   0.2% ( -11% -  13%)    0.939
HighTermMonthSort          3491.38   (1.4%)                  3499.56   (2.4%)   0.2% (  -3% -   4%)    0.702
Fuzzy1                       89.72   (1.1%)                    90.06   (1.2%)   0.4% (  -1% -   2%)    0.285
MedSloppyPhrase               7.70   (3.6%)                     7.73   (6.6%)   0.4% (  -9% -  10%)    0.791
PKLookup                    288.99   (1.7%)                   290.39   (1.5%)   0.5% (  -2% -   3%)    0.332
HighTermDayOfYearSort       829.90   (3.4%)                   836.92   (3.4%)   0.8% (  -5% -   7%)    0.436
HighPhrase                   66.40   (2.8%)                    66.97   (5.1%)   0.9% (  -6% -   9%)    0.508
Prefix3                     185.20   (3.1%)                   186.81   (2.8%)   0.9% (  -4% -   7%)    0.352
HighTermTitleSort           162.00   (4.1%)                   163.43   (5.0%)   0.9% (  -7% -  10%)    0.543
HighSloppyPhrase              6.91   (3.0%)                     6.98   (6.6%)   1.0% (  -8% -  10%)    0.521
Wildcard                     53.89   (3.9%)                    54.64   (3.5%)   1.4% (  -5% -   9%)    0.237
OrHighRare                  252.56   (9.3%)                   257.10   (9.6%)   1.8% ( -15% -  22%)    0.548
MedIntervalsOrdered           3.33   (4.7%)                     3.40   (5.6%)   1.9% (  -8% -  12%)    0.259
HighSpanNear                  4.60   (1.5%)                     4.69   (1.4%)   1.9% (  -1% -   4%)    0.000
OrNotHighHigh               221.99   (3.7%)                   226.47   (5.4%)   2.0% (  -6% -  11%)    0.164
OrHighLow                   812.48   (2.0%)                   828.98   (2.3%)   2.0% (  -2% -   6%)    0.003
HighIntervalsOrdered          1.36   (5.3%)                     1.39   (6.4%)   2.0% (  -9% -  14%)    0.272
LowIntervalsOrdered           4.38   (4.1%)                     4.47   (4.5%)   2.1% (  -6% -  11%)    0.130
OrHighNotHigh               207.32   (4.1%)                   211.83   (5.6%)   2.2% (  -7% -  12%)    0.159
OrNotHighMed                334.67   (3.7%)                   342.02   (5.8%)   2.2% (  -7% -  12%)    0.152
MedSpanNear                  14.99   (1.8%)                    15.32   (1.4%)   2.2% (  -1% -   5%)    0.000
MedPhrase                    14.61   (3.2%)                    14.97   (4.6%)   2.5% (  -5% -  10%)    0.052
LowPhrase                    71.73   (3.3%)                    73.49   (4.4%)   2.5% (  -5% -  10%)    0.046
AndHighHigh                  70.54   (2.0%)                    72.44   (1.3%)   2.7% (   0% -   6%)    0.000
OrHighHigh                   67.00   (1.7%)                    68.80   (1.6%)   2.7% (   0% -   6%)    0.000
OrHighMed                   191.06   (1.9%)                   196.57   (1.8%)   2.9% (   0% -   6%)    0.000
LowSloppyPhrase              24.50   (2.9%)                    25.27   (4.1%)   3.2% (  -3% -  10%)    0.005
AndHighMed                  151.19   (2.2%)                   156.00   (1.4%)   3.2% (   0% -   6%)    0.000
CountPhrase                   3.18   (9.0%)                     3.29   (9.1%)   3.3% ( -13% -  23%)    0.244
Or2Terms2StopWords          160.27   (4.4%)                   165.68   (1.5%)   3.4% (  -2% -   9%)    0.001
And2Terms2StopWords         157.18   (2.8%)                   162.48   (1.4%)   3.4% (   0% -   7%)    0.000
OrHighNotMed                301.42   (4.1%)                   311.77   (6.1%)   3.4% (  -6% -  14%)    0.038
LowSpanNear                   9.83   (1.5%)                    10.17   (1.4%)   3.5% (   0% -   6%)    0.000
CountAndHighHigh             46.76   (1.3%)                    48.48   (2.3%)   3.7% (   0% -   7%)    0.000
HighTermTitleBDVSort         11.40   (5.6%)                    11.82   (8.1%)   3.7% (  -9% -  18%)    0.092
OrHighNotLow                342.09   (4.6%)                   355.83   (7.0%)   4.0% (  -7% -  16%)    0.033
And3Terms                   165.26   (3.0%)                   171.96   (1.6%)   4.1% (   0% -   8%)    0.000
Or3Terms                    165.25   (4.5%)                   171.96   (1.5%)   4.1% (  -1% -  10%)    0.000
AndHighLow                  993.19   (2.8%)                  1034.27   (2.8%)   4.1% (  -1% -  10%)    0.000
OrNotHighLow                989.09   (3.1%)                  1030.58   (3.5%)   4.2% (  -2% -  11%)    0.000
AndStopWords                 29.62   (4.1%)                    30.97   (1.9%)   4.6% (  -1% -  11%)    0.000
OrStopWords                  32.89   (7.0%)                    34.41   (2.6%)   4.6% (  -4% -  15%)    0.006
LowTerm                     978.60   (3.0%)                  1023.81   (6.1%)   4.6% (  -4% -  14%)    0.002
CountAndHighMed             140.50   (1.6%)                   147.18   (2.6%)   4.8% (   0% -   9%)    0.000
CountOrHighHigh              57.94  (15.1%)                    60.85  (16.7%)   5.0% ( -23% -  43%)    0.319
CountOrHighMed              113.79  (11.2%)                   120.21  (13.2%)   5.6% ( -16% -  33%)    0.145
HighTerm                    363.62   (5.0%)                   384.13   (8.2%)   5.6% (  -7% -  19%)    0.009
MedTerm                     546.01   (4.2%)                   580.61   (7.9%)   6.3% (  -5% -  19%)    0.002
```
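To illustrate the second change described above (decoding more longs than requested rather than finishing with scalar instructions), here is a minimal sketch in plain Java. It is not the code from this pull request: the class `OverDecodeSketch`, its methods, and the `LANES` constant are hypothetical names, and an inner loop over `long[]` stands in for a single vector load, shift, mask and store. It only shows why rounding the count up to a multiple of the lane width removes the scalar tail, assuming the caller pads both buffers.

```java
import java.util.Arrays;

/** Scalar stand-in for the "decode more than requested" idea; not Lucene's code. */
class OverDecodeSketch {
  // Hypothetical lane count: how many longs one vector instruction would process.
  static final int LANES = 4;

  /** Baseline: full chunks in the main loop, leftover values in a scalar tail. */
  static void decodeWithTail(long[] in, int count, int shift, long mask, long[] out) {
    int bound = count - (count % LANES);
    int i = 0;
    for (; i < bound; i += LANES) {
      decodeChunk(in, i, shift, mask, out);
    }
    for (; i < count; i++) { // scalar tail for the last count % LANES values
      out[i] = (in[i] >>> shift) & mask;
    }
  }

  /** Over-decoding: round count up to a multiple of LANES so no tail is needed. */
  static void decodeRoundedUp(long[] in, int count, int shift, long mask, long[] out) {
    int rounded = paddedLength(count);
    for (int i = 0; i < rounded; i += LANES) {
      decodeChunk(in, i, shift, mask, out);
    }
  }

  /** Smallest multiple of LANES >= count; both arrays must be at least this long. */
  static int paddedLength(int count) {
    return (count + LANES - 1) / LANES * LANES;
  }

  // Stands in for one vector load, shift, mask and store.
  private static void decodeChunk(long[] in, int from, int shift, long mask, long[] out) {
    for (int lane = 0; lane < LANES; lane++) {
      out[from + lane] = (in[from + lane] >>> shift) & mask;
    }
  }

  public static void main(String[] args) {
    int count = 6; // deliberately not a multiple of LANES
    long[] in = new long[paddedLength(count)];
    for (int i = 0; i < in.length; i++) {
      in[i] = (i + 1L) << 8;
    }
    long[] a = new long[in.length];
    long[] b = new long[in.length];
    decodeWithTail(in, count, 8, 0xFF, a);
    decodeRoundedUp(in, count, 8, 0xFF, b);
    // The first `count` outputs agree; b also holds extra values the caller ignores.
    System.out.println(Arrays.equals(a, 0, count, b, 0, count));
  }
}
```

The trade-off in this sketch is that `decodeRoundedUp` may read and write up to `LANES - 1` entries past `count`, so it is only safe when the input and output are padded to `paddedLength(count)` and the caller ignores the extra values.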