sgup432 opened a new pull request, #16050: URL: https://github.com/apache/lucene/pull/16050
### Description

Numeric range queries on dense fields use `DocValuesRangeIterator`, a `TwoPhaseIterator` that uses `SkipBlockRangeIterator` as its approximation. This works well, but for MAYBE blocks (where values partially overlap the query range) it still falls back to per-doc evaluation: each doc is checked individually via `values.advance(doc)`, `values.longValue()`, and a range comparison.

Because `DocValuesRangeIterator` is a `TwoPhaseIterator`, `DenseConjunctionBulkScorer` routes it through the leap-frog path (see [here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java#L201C5-L208C6)) and `intoBitSet()` is never called. As a result, SIMD is never used for MAYBE block evaluation, even though the underlying storage for dense fields is a packed long[] that is ideal for vectorized comparison.

### PR changes

For dense singleton numeric fields with a skip index, replace `DocValuesRangeIterator` with a new `BatchDocValuesRangeIterator`, which is a plain `DocIdSetIterator` (not a `TwoPhaseIterator`). This forces `DenseConjunctionBulkScorer` to call `intoBitSet()` on it directly, enabling the bitset intersection path. I am open to suggestions on whether this is the right approach.

This PR also adds SIMD-accelerated bulk range evaluation for MAYBE (partial overlap) blocks, which seem to be the most expensive case when running range queries through doc values. This involves the following changes:

- Add `NumericDocValues.rangeIntoBitSet(fromDoc, toDoc, minValue, maxValue, bitSet, offset)`: a new bulk API with a per-doc fallback default. `Lucene90DocValuesProducer` overrides this for dense fields to dispatch to the vectorization layer.
- Add a **DocValuesRangeSupport** interface with two implementations:
  - **PanamaDocValuesRangeSupport**: SIMD implementation using the Panama Vector API (`LongVector.SPECIES_PREFERRED`). Evaluates multiple values per CPU instruction using vectorized range comparisons.
  - **DefaultDocValuesRangeSupport**: scalar tight-loop fallback.
- `VectorizationProvider.getDocValuesRangeSupport()` returns the appropriate implementation at startup.

### Benchmarks

```
MultiFieldDocValuesRangeBenchmark (c5.2xlarge, AVX-512)
Mode: Throughput (ops/s, higher is better)
JVM args: --add-modules=jdk.incubator.vector
Warmup: 3 x 3s, Measurement: 5 x 5s, Fork: 1
```

Data Pattern | docCount | Fields | Baseline (ops/s) | Optimized (ops/s) | Change
-------------|----------|--------|------------------|-------------------|-------
random       | 1M       | 1      | 59.99            | 208.27            | +247%
random       | 1M       | 3      | 34.83            | 69.30             | +99%
random       | 1M       | 5      | 29.40            | 65.10             | +121%
random       | 10M      | 1      | 6.12             | 25.16             | +311%
random       | 10M      | 3      | 3.41             | 8.38              | +146%
random       | 10M      | 5      | 2.82             | 7.45              | +164%
clustered    | 1M       | 1      | 6231.86          | 8584.63           | +38%
clustered    | 1M       | 3      | 9142.82          | 35488.66          | +288%
clustered    | 1M       | 5      | 7072.30          | 32583.89          | +361%
clustered    | 10M      | 1      | 685.27           | 1253.04           | +83%
clustered    | 10M      | 3      | 8314.53          | 23913.65          | +188%
clustered    | 10M      | 5      | 8855.14          | 12703.13          | +43%

Every configuration improves over the baseline, with gains ranging from +38% to +361%.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
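To illustrate the per-doc fallback behind `rangeIntoBitSet` described under "PR changes", here is a minimal sketch in plain Java. It is not the PR's code. The class name `RangeIntoBitSetSketch`, the `long[] values` parameter standing in for the dense doc-values storage, the use of `java.util.BitSet` in place of Lucene's own bitset type, and the reading of `offset` as "doc `d` sets bit `d - offset`" are all assumptions made for this example.

```java
import java.util.BitSet;

// Hypothetical sketch, not the PR's implementation.
final class RangeIntoBitSetSketch {

  /**
   * Sets bit (doc - offset) for every doc in [fromDoc, toDoc) whose value
   * lies in [minValue, maxValue]. values[doc] stands in for a dense field
   * where every doc has exactly one value (an assumption of this sketch).
   */
  static void rangeIntoBitSet(
      long[] values, int fromDoc, int toDoc,
      long minValue, long maxValue, BitSet bitSet, int offset) {
    for (int doc = fromDoc; doc < toDoc; doc++) {
      long v = values[doc];
      // Inclusive range check on both bounds.
      if (v >= minValue && v <= maxValue) {
        bitSet.set(doc - offset);
      }
    }
  }

  public static void main(String[] args) {
    long[] values = {5, 10, 15, 20, 25};
    BitSet bits = new BitSet(values.length);
    rangeIntoBitSet(values, 0, values.length, 10, 20, bits, 0);
    System.out.println(bits); // prints {1, 2, 3}
  }
}
```

A loop of this shape does a plain array read and two comparisons per doc, with no per-doc `advance` call, which is the kind of work that a vectorized implementation can process several values at a time.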
