easyice commented on PR #15206:
URL: https://github.com/apache/lucene/pull/15206#issuecomment-3315760492
I mocked some data so that the doc values use single-block GCD encoding.
After experimenting with different bits-per-value (BPV) widths and read
densities, the benchmarks indicate performance improvements when reads are dense.
Note that the `NumericDocValues#advanceExact` and `NumericDocValues#longValue`
call sites are not polymorphic in this benchmark, since only one implementation
is exercised. The difference would likely be bigger if they were.
`step` is the interval between consecutive doc IDs passed to `advanceExact` on
the doc values, so a smaller `step` means denser reads.
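For readers unfamiliar with GCD encoding, here is a minimal sketch of the idea, assuming an encoder that subtracts the minimum and divides by the common divisor of the deltas. The class `GcdEncodingSketch` and its methods are hypothetical illustrations, not Lucene code, and the three sample values are made up to resemble the benchmark's `bpv = 8`, `gcd = 1000` data.

```java
/** Hypothetical illustration of GCD encoding; not taken from Lucene. */
public class GcdEncodingSketch {

  /** Greatest common divisor of two non-negative longs (Euclid's algorithm). */
  public static long gcd(long a, long b) {
    while (b != 0) {
      long t = a % b;
      a = b;
      b = t;
    }
    return a;
  }

  /** GCD of all (value - min) deltas; returns 0 if every value equals min. */
  public static long commonDivisor(long[] values, long min) {
    long g = 0;
    for (long v : values) {
      g = gcd(g, v - min);
    }
    return g;
  }

  /** Number of bits needed to store the non-negative value max (at least 1). */
  public static int bitsRequired(long max) {
    return Math.max(1, 64 - Long.numberOfLeadingZeros(max));
  }

  public static void main(String[] args) {
    // Made-up sample: multiples of 1000 with the multiplier in [128, 256).
    long[] values = {128_000L, 200_000L, 255_000L};
    long min = 128_000L;
    long g = commonDivisor(values, min);

    long maxEncoded = 0;
    for (long v : values) {
      // Assumed encoding: store (v - min) / g instead of v itself.
      maxEncoded = Math.max(maxEncoded, (v - min) / g);
    }
    System.out.println("gcd=" + g
        + " bitsRaw=" + bitsRequired(255_000L)
        + " bitsEncoded=" + bitsRequired(maxEncoded));
  }
}
```

Decoding under this assumption is `min + encoded * gcd`, which is the extra arithmetic a reader pays per value.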
⸻
JMH output
```
Benchmark                             (bpv)  (step)  Mode  Cnt   Score    Error  Units
LongValuesBenchmark.bench_longValue       8       1  avgt    5   7.238 ±  0.400  us/op
LongValuesBenchmark.bench_longValue       8       4  avgt    5   1.437 ±  0.038  us/op
LongValuesBenchmark.bench_longValue       8      16  avgt    5   0.401 ±  0.016  us/op
LongValuesBenchmark.bench_longValue      16       1  avgt    5   6.052 ±  2.331  us/op
LongValuesBenchmark.bench_longValue      16       4  avgt    5   1.466 ±  0.056  us/op
LongValuesBenchmark.bench_longValue      16      16  avgt    5   0.439 ±  0.018  us/op
LongValuesBenchmark.bench_longValue      32       1  avgt    5   7.720 ±  0.327  us/op
LongValuesBenchmark.bench_longValue      32       4  avgt    5   1.504 ±  0.025  us/op
LongValuesBenchmark.bench_longValue      32      16  avgt    5   0.466 ±  0.009  us/op
LongValuesBenchmark.bench_longValues      8       1  avgt    5  25.548 ±  1.027  us/op
LongValuesBenchmark.bench_longValues      8       4  avgt    5   8.496 ±  0.331  us/op
LongValuesBenchmark.bench_longValues      8      16  avgt    5   0.352 ±  0.014  us/op
LongValuesBenchmark.bench_longValues     16       1  avgt    5  28.536 ±  1.434  us/op
LongValuesBenchmark.bench_longValues     16       4  avgt    5   8.339 ±  0.038  us/op
LongValuesBenchmark.bench_longValues     16      16  avgt    5   0.351 ±  0.006  us/op
LongValuesBenchmark.bench_longValues     32       1  avgt    5  26.983 ±  0.429  us/op
LongValuesBenchmark.bench_longValues     32       4  avgt    5   1.346 ±  0.113  us/op
LongValuesBenchmark.bench_longValues     32      16  avgt    5   0.346 ±  0.009  us/op
```
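One way to read the table is to divide the `bench_longValue` score by the `bench_longValues` score for the same `(bpv, step)` pair. Since the mode is average time, a ratio above 1 means the bulk call took less time per operation. The sketch below does this for the three `bpv = 8` rows; the class `ScoreRatioSketch` is a hypothetical helper, and the error bars are ignored.

```java
/** Hypothetical helper for comparing the two benchmark methods row by row. */
public class ScoreRatioSketch {

  /** Per-doc loop time divided by bulk-call time; above 1 means the bulk call was faster. */
  public static double ratio(double longValueScore, double longValuesScore) {
    return longValueScore / longValuesScore;
  }

  public static void main(String[] args) {
    // {step, bench_longValue us/op, bench_longValues us/op} for bpv = 8, copied from the table.
    double[][] rows = {
      {1, 7.238, 25.548},
      {4, 1.437, 8.496},
      {16, 0.401, 0.352},
    };
    for (double[] row : rows) {
      System.out.printf("step=%d ratio=%.3f%n", (int) row[0], ratio(row[1], row[2]));
    }
  }
}
```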
<details>
<summary>Code</summary>
```
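import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3, time = 3)
@Measurement(iterations = 5, time = 5)
@Fork(
    value = 1,
    jvmArgsPrepend = {"--add-modules=jdk.unsupported", "--enable-native-access=ALL-UNNAMED"})
public class LongValuesBenchmark {
  private IndexReader reader;
  private final int numDocs = 10000;
  private final long[] values = new long[numDocs];
  private final int[] docs = new int[numDocs];
  private int docsSize;

  // Interval between consecutive doc IDs that are read.
  @Param({"1", "4", "16"})
  // @Param({"16"})
  public int step;

  // Bit width of the random multiplier applied to the GCD.
  @Param({"8", "16", "32"})
  // @Param({"8"})
  public int bpv;

  @Setup(Level.Trial)
  public void setup() throws Exception {
    Path path = Files.createTempDirectory("longValues");
    Directory dir = MMapDirectory.open(path);
    IndexWriter w =
        new IndexWriter(
            dir,
            new IndexWriterConfig()
                .setOpenMode(IndexWriterConfig.OpenMode.CREATE)
                .setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH));
    Random r = new Random(0);
    final int gcd = 1000;
    // Every value is a multiple of gcd so that GCD encoding is selected.
    for (int i = 0; i < numDocs; i++) {
      Document doc = new Document();
      long v = r.nextLong(1L << (bpv - 1), 1L << bpv) * gcd;
      doc.add(new NumericDocValuesField("f", v));
      w.addDocument(doc);
    }
    // Precompute the doc IDs used by the bulk read.
    docsSize = 0;
    for (int i = 0; i < numDocs; i += step) {
      docs[docsSize++] = i;
    }
    w.commit();
    // Merge to a single segment so that leaves().get(0) covers every doc.
    w.forceMerge(1);
    w.commit();
    reader = DirectoryReader.open(w);
    w.close();
  }

  // Per-document path: one advanceExact + longValue call per doc ID.
  @Benchmark
  public void bench_longValue(Blackhole bh) throws IOException {
    NumericDocValues ndv;
    ndv = reader.leaves().get(0).reader().getNumericDocValues("f");
    for (int i = 0; i < numDocs; i += step) {
      ndv.advanceExact(i);
      values[i] = ndv.longValue();
    }
    bh.consume(values);
  }

  // Bulk path: a single longValues call for all precomputed doc IDs.
  @Benchmark
  public void bench_longValues(Blackhole bh) throws IOException {
    NumericDocValues ndv;
    ndv = reader.leaves().get(0).reader().getNumericDocValues("f");
    ndv.longValues(docsSize, docs, values, 0);
    bh.consume(values);
  }
}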
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.