[
https://issues.apache.org/jira/browse/LUCENE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314116#comment-17314116
]
Greg Miller commented on LUCENE-9850:
-------------------------------------
I think I've actually made good progress on optimizing this. The last bit to
crack was reading the exception bytes in all at once instead of byte-by-byte.
The benchmark results look pretty promising to me. I'd be curious what others
think of these results. If others agree they look promising, I'll tidy up the
[code|https://github.com/apache/lucene/compare/main...gsmiller:LUCENE-9850/pfordocids2],
write some tests and put a PR out there. And just as a reminder, this change
reduces the doc ID data in the Wikipedia benchmark index by *~12%*, resulting
in a *~3.3% overall index size reduction*.
Benchmark results are as follows. There are still statistically significant
regressions in a few tasks (e.g. LowSpanNear, AndHighMed, MedSpanNear,
AndHighHigh), but they're much smaller than before. There are now also
statistically significant QPS improvements in some other tasks (e.g. IntNRQ,
HighTerm, LowTerm). These seem like decent results overall, but others will
have more opinions on this I'm sure.
{code:java}
Task                       QPS baseline   StdDev  QPS pfordocids   StdDev  Pct diff                 p-value
HighTermDayOfYearSort             76.66  (13.5%)           74.15  (12.0%)     -3.3% (-25% -  25%)    0.418
HighTermMonthSort                105.44  (10.3%)          102.66   (8.5%)     -2.6% (-19% -  18%)    0.379
LowSpanNear                       38.95   (2.3%)           38.05   (1.7%)     -2.3% ( -6% -   1%)    0.000
AndHighMed                        85.41   (2.6%)           83.58   (2.5%)     -2.1% ( -7% -   3%)    0.008
OrHighHigh                        16.29   (1.9%)           16.00   (6.1%)     -1.8% ( -9% -   6%)    0.209
TermDTSort                       186.08  (11.7%)          182.87  (10.0%)     -1.7% (-20% -  22%)    0.616
MedSpanNear                       12.45   (2.2%)           12.24   (1.6%)     -1.7% ( -5% -   2%)    0.005
HighIntervalsOrdered               6.11   (2.8%)            6.00   (2.8%)     -1.7% ( -7% -   4%)    0.061
HighTermTitleBDVSort              61.20  (10.8%)           60.19  (10.2%)     -1.7% (-20% -  21%)    0.617
MedSloppyPhrase                   29.64   (4.6%)           29.16   (4.5%)     -1.6% (-10% -   7%)    0.262
AndHighHigh                       58.98   (1.9%)           58.05   (2.2%)     -1.6% ( -5% -   2%)    0.015
OrHighMed                         62.28   (2.6%)           61.33   (6.0%)     -1.5% ( -9% -   7%)    0.293
LowSloppyPhrase                   11.15   (3.2%)           10.98   (3.2%)     -1.5% ( -7% -   5%)    0.134
HighSpanNear                       7.33   (2.7%)            7.25   (2.3%)     -1.2% ( -6% -   3%)    0.143
Prefix3                          239.67  (15.4%)          237.28  (18.6%)     -1.0% (-30% -  38%)    0.853
LowPhrase                        131.88   (2.3%)          130.75   (2.6%)     -0.9% ( -5% -   4%)    0.264
OrHighLow                        264.48   (5.4%)          262.67   (5.7%)     -0.7% (-11% -  10%)    0.695
BrowseDayOfYearSSDVFacets         11.57   (4.5%)           11.49   (5.0%)     -0.7% ( -9% -   9%)    0.661
BrowseMonthSSDVFacets             12.38   (6.6%)           12.31   (7.9%)     -0.6% (-14% -  14%)    0.799
Fuzzy2                            52.44  (12.0%)           52.24  (12.3%)     -0.4% (-22% -  27%)    0.920
PKLookup                         146.58   (3.8%)          146.05   (4.1%)     -0.4% ( -7% -   7%)    0.774
HighSloppyPhrase                   1.78   (5.1%)            1.77   (4.2%)     -0.3% ( -9% -   9%)    0.839
BrowseDayOfYearTaxoFacets          4.15   (3.4%)            4.15   (2.9%)     -0.0% ( -6% -   6%)    0.962
BrowseDateTaxoFacets               4.15   (3.4%)            4.14   (2.8%)     -0.0% ( -6% -   6%)    0.964
BrowseMonthTaxoFacets              5.02   (2.7%)            5.02   (2.4%)      0.1% ( -4% -   5%)    0.920
MedPhrase                         87.31   (3.1%)           87.43   (3.1%)      0.1% ( -5% -   6%)    0.893
Respell                           44.49   (2.2%)           44.74   (2.8%)      0.6% ( -4% -   5%)    0.484
Wildcard                          54.13   (3.3%)           54.49   (3.8%)      0.7% ( -6% -   7%)    0.557
OrHighNotMed                     521.75   (5.5%)          526.04   (6.2%)      0.8% (-10% -  13%)    0.659
Fuzzy1                            54.77   (9.9%)           55.25   (7.8%)      0.9% (-15% -  20%)    0.753
IntNRQ                            94.71   (0.5%)           95.68   (1.3%)      1.0% (  0% -   2%)    0.001
OrNotHighLow                     618.18   (5.1%)          624.93   (6.1%)      1.1% ( -9% -  12%)    0.537
OrNotHighMed                     473.89   (3.2%)          479.65   (5.4%)      1.2% ( -7% -  10%)    0.388
MedTerm                         1115.74   (4.5%)         1130.77   (6.8%)      1.3% ( -9% -  13%)    0.463
OrHighNotHigh                    428.11   (4.0%)          436.25   (6.9%)      1.9% ( -8% -  13%)    0.286
HighPhrase                        86.34   (2.6%)           87.99   (3.8%)      1.9% ( -4% -   8%)    0.064
AndHighLow                       612.62   (4.5%)          625.92   (6.6%)      2.2% ( -8% -  13%)    0.223
OrHighNotLow                     551.28   (4.5%)          567.10   (7.0%)      2.9% ( -8% -  15%)    0.125
HighTerm                         917.35   (2.5%)          945.58   (4.8%)      3.1% ( -4% -  10%)    0.011
OrNotHighHigh                    473.03   (6.0%)          494.08   (8.5%)      4.4% ( -9% -  20%)    0.056
LowTerm                         1060.12   (3.2%)         1111.79   (6.5%)      4.9% ( -4% -  15%)    0.003
{code}
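For anyone reading the table cold, here is a minimal sketch of how the "Pct diff" column and its range can be reproduced from the QPS and StdDev columns. The point estimate is just the relative QPS change. The range formula (candidate and baseline each shifted by one standard deviation in opposite directions, then truncated to an integer) is an assumption that happens to reproduce the rows checked here, not something confirmed against the benchmark tool's source. The class and method names are hypothetical.

```java
public class PctDiffSketch {
  /** Relative QPS change of the candidate vs. the baseline, in percent. */
  public static double pctDiff(double baseQps, double candQps) {
    return 100.0 * (candQps - baseQps) / baseQps;
  }

  /**
   * Assumed lower bound of the range: candidate one std dev slower,
   * baseline one std dev faster. Truncated toward zero.
   */
  public static int worstCasePct(
      double baseQps, double baseStdDevPct, double candQps, double candStdDevPct) {
    double baseHigh = baseQps * (1 + baseStdDevPct / 100.0);
    double candLow = candQps * (1 - candStdDevPct / 100.0);
    return (int) (100.0 * (candLow - baseHigh) / baseHigh);
  }

  /**
   * Assumed upper bound of the range: candidate one std dev faster,
   * baseline one std dev slower. Truncated toward zero.
   */
  public static int bestCasePct(
      double baseQps, double baseStdDevPct, double candQps, double candStdDevPct) {
    double baseLow = baseQps * (1 - baseStdDevPct / 100.0);
    double candHigh = candQps * (1 + candStdDevPct / 100.0);
    return (int) (100.0 * (candHigh - baseLow) / baseLow);
  }

  public static void main(String[] args) {
    // LowTerm row: 1060.12 (3.2%) baseline vs. 1111.79 (6.5%) pfordocids.
    System.out.printf(
        "%.1f%% (%d%% - %d%%)%n",
        pctDiff(1060.12, 1111.79),
        worstCasePct(1060.12, 3.2, 1111.79, 6.5),
        bestCasePct(1060.12, 3.2, 1111.79, 6.5));
  }
}
```

The wide ranges on the sort and fuzzy tasks come from their large run-to-run StdDev, which is why their p-values are high even where the point estimate looks like a regression.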
Note how the flame chart for the "applyExceptionsIn32Space" step changes when
all exception bytes are read at once (compare with my previous comment above),
which seems to have helped. Reading the exception bytes now appears to account
for ~1/2 of the time spent applying the exceptions, vs. ~2/3 before:
!bulk_read_1.png|width=845,height=109!
!bulk_read_2.png|width=839,height=84!
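As a rough illustration of the bulk-read change described above (not the code from the linked branch), the sketch below contrasts patching exceptions with two single-byte reads per exception against one bulk read into a scratch buffer. It assumes exceptions are stored as (index, high bits) byte pairs and are patched by OR-ing the high bits in above bitsPerValue. The class and method names and the use of java.io.DataInputStream are hypothetical, and it doesn't attempt the packed layout that "applyExceptionsIn32Space" works on.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ExceptionPatchSketch {
  /** Byte-by-byte variant: two stream reads per exception. */
  public static void applyByteByByte(
      DataInputStream in, int numExceptions, int bitsPerValue, long[] values)
      throws IOException {
    for (int i = 0; i < numExceptions; i++) {
      int index = in.readUnsignedByte();
      long highBits = in.readUnsignedByte();
      values[index] |= highBits << bitsPerValue;
    }
  }

  /** Bulk variant: one stream read for all exceptions, then patch from the buffer. */
  public static void applyBulk(
      DataInputStream in, int numExceptions, int bitsPerValue, long[] values, byte[] scratch)
      throws IOException {
    in.readFully(scratch, 0, numExceptions * 2);
    for (int i = 0; i < numExceptions; i++) {
      int index = Byte.toUnsignedInt(scratch[2 * i]);
      long highBits = Byte.toUnsignedLong(scratch[2 * i + 1]);
      values[index] |= highBits << bitsPerValue;
    }
  }

  /** Decodes a tiny hand-made block with either variant and returns the patched values. */
  public static long[] demo(boolean bulk) {
    // Two exceptions: (index 2, high bits 1) and (index 5, high bits 3).
    byte[] encoded = {2, 1, 5, 3};
    // Low 2 bits of each value, as if already unpacked with bitsPerValue = 2.
    long[] values = {1, 2, 3, 0, 1, 2, 3, 0};
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
      if (bulk) {
        applyBulk(in, 2, 2, values, new byte[encoded.length]);
      } else {
        applyByteByByte(in, 2, 2, values);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return values;
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(demo(true)));
    System.out.println(java.util.Arrays.toString(demo(false)));
  }
}
```

Both variants produce the same patched values. The only difference is how many calls go through the input abstraction per block, which is the cost the flame charts point at.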
> Explore PFOR for Doc ID delta encoding (instead of FOR)
> -------------------------------------------------------
>
> Key: LUCENE-9850
> URL: https://issues.apache.org/jira/browse/LUCENE-9850
> Project: Lucene - Core
> Issue Type: Task
> Components: core/codecs
> Affects Versions: main (9.0)
> Reporter: Greg Miller
> Priority: Minor
> Attachments: apply_exceptions.png, bulk_read_1.png, bulk_read_2.png,
> for.png, pfor.png
>
>
> It'd be interesting to explore using PFOR instead of FOR for doc ID encoding.
> Right now PFOR is used for positions, frequencies and payloads, but FOR is
> used for doc ID deltas. From a recent
> [conversation|http://mail-archives.apache.org/mod_mbox/lucene-dev/202103.mbox/%3CCAPsWd%2BOp7d_GxNosB5r%3DQMPA-v0SteHWjXUmG3gwQot4gkubWw%40mail.gmail.com%3E]
> on the dev mailing list, it sounds like this decision was made based on the
> optimization possible when expanding the deltas.
> I'd be interested in measuring the index size reduction possible with
> switching to PFOR compared to the performance reduction we might see by no
> longer being able to apply the deltas in as optimal a way.