[ https://issues.apache.org/jira/browse/LUCENE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314116#comment-17314116 ]
Greg Miller commented on LUCENE-9850:
-------------------------------------

I think I've actually made good progress on optimizing this. The last bit to crack was reading the exception bytes in all at once instead of byte-by-byte. The benchmark results look pretty promising to me, and I'd be curious what others think of them. If others agree they look promising, I'll tidy up the [code|https://github.com/apache/lucene/compare/main...gsmiller:LUCENE-9850/pfordocids2], write some tests and put a PR out there.

And just as a reminder, this change reduces the doc ID data in the Wikipedia benchmark index by *~12%*, resulting in an *~3.3% overall index size reduction*.

Benchmark results are as follows. It does look like there's still a significant regression in some tasks (LowSpanNear, AndHighMed, MedSpanNear and AndHighHigh all have p-values below 0.05), but it is much smaller than before. There are now also some significant QPS improvements for other tasks (IntNRQ, HighTerm and LowTerm). Seems like decent results overall, but others will have more opinions on this I'm sure.

{code:java}
                     Task  QPS baseline   StdDev  QPS pfordocids   StdDev              Pct diff  p-value
    HighTermDayOfYearSort         76.66  (13.5%)           74.15  (12.0%)   -3.3% (-25% -  25%)    0.418
        HighTermMonthSort        105.44  (10.3%)          102.66   (8.5%)   -2.6% (-19% -  18%)    0.379
              LowSpanNear         38.95   (2.3%)           38.05   (1.7%)   -2.3% ( -6% -   1%)    0.000
               AndHighMed         85.41   (2.6%)           83.58   (2.5%)   -2.1% ( -7% -   3%)    0.008
               OrHighHigh         16.29   (1.9%)           16.00   (6.1%)   -1.8% ( -9% -   6%)    0.209
               TermDTSort        186.08  (11.7%)          182.87  (10.0%)   -1.7% (-20% -  22%)    0.616
              MedSpanNear         12.45   (2.2%)           12.24   (1.6%)   -1.7% ( -5% -   2%)    0.005
     HighIntervalsOrdered          6.11   (2.8%)            6.00   (2.8%)   -1.7% ( -7% -   4%)    0.061
     HighTermTitleBDVSort         61.20  (10.8%)           60.19  (10.2%)   -1.7% (-20% -  21%)    0.617
          MedSloppyPhrase         29.64   (4.6%)           29.16   (4.5%)   -1.6% (-10% -   7%)    0.262
              AndHighHigh         58.98   (1.9%)           58.05   (2.2%)   -1.6% ( -5% -   2%)    0.015
                OrHighMed         62.28   (2.6%)           61.33   (6.0%)   -1.5% ( -9% -   7%)    0.293
          LowSloppyPhrase         11.15   (3.2%)           10.98   (3.2%)   -1.5% ( -7% -   5%)    0.134
             HighSpanNear          7.33   (2.7%)            7.25   (2.3%)   -1.2% ( -6% -   3%)    0.143
                  Prefix3        239.67  (15.4%)          237.28  (18.6%)   -1.0% (-30% -  38%)    0.853
                LowPhrase        131.88   (2.3%)          130.75   (2.6%)   -0.9% ( -5% -   4%)    0.264
                OrHighLow        264.48   (5.4%)          262.67   (5.7%)   -0.7% (-11% -  10%)    0.695
BrowseDayOfYearSSDVFacets         11.57   (4.5%)           11.49   (5.0%)   -0.7% ( -9% -   9%)    0.661
    BrowseMonthSSDVFacets         12.38   (6.6%)           12.31   (7.9%)   -0.6% (-14% -  14%)    0.799
                   Fuzzy2         52.44  (12.0%)           52.24  (12.3%)   -0.4% (-22% -  27%)    0.920
                 PKLookup        146.58   (3.8%)          146.05   (4.1%)   -0.4% ( -7% -   7%)    0.774
         HighSloppyPhrase          1.78   (5.1%)            1.77   (4.2%)   -0.3% ( -9% -   9%)    0.839
BrowseDayOfYearTaxoFacets          4.15   (3.4%)            4.15   (2.9%)   -0.0% ( -6% -   6%)    0.962
     BrowseDateTaxoFacets          4.15   (3.4%)            4.14   (2.8%)   -0.0% ( -6% -   6%)    0.964
    BrowseMonthTaxoFacets          5.02   (2.7%)            5.02   (2.4%)    0.1% ( -4% -   5%)    0.920
                MedPhrase         87.31   (3.1%)           87.43   (3.1%)    0.1% ( -5% -   6%)    0.893
                  Respell         44.49   (2.2%)           44.74   (2.8%)    0.6% ( -4% -   5%)    0.484
                 Wildcard         54.13   (3.3%)           54.49   (3.8%)    0.7% ( -6% -   7%)    0.557
             OrHighNotMed        521.75   (5.5%)          526.04   (6.2%)    0.8% (-10% -  13%)    0.659
                   Fuzzy1         54.77   (9.9%)           55.25   (7.8%)    0.9% (-15% -  20%)    0.753
                   IntNRQ         94.71   (0.5%)           95.68   (1.3%)    1.0% (  0% -   2%)    0.001
             OrNotHighLow        618.18   (5.1%)          624.93   (6.1%)    1.1% ( -9% -  12%)    0.537
             OrNotHighMed        473.89   (3.2%)          479.65   (5.4%)    1.2% ( -7% -  10%)    0.388
                  MedTerm       1115.74   (4.5%)         1130.77   (6.8%)    1.3% ( -9% -  13%)    0.463
            OrHighNotHigh        428.11   (4.0%)          436.25   (6.9%)    1.9% ( -8% -  13%)    0.286
               HighPhrase         86.34   (2.6%)           87.99   (3.8%)    1.9% ( -4% -   8%)    0.064
               AndHighLow        612.62   (4.5%)          625.92   (6.6%)    2.2% ( -8% -  13%)    0.223
             OrHighNotLow        551.28   (4.5%)          567.10   (7.0%)    2.9% ( -8% -  15%)    0.125
                 HighTerm        917.35   (2.5%)          945.58   (4.8%)    3.1% ( -4% -  10%)    0.011
            OrNotHighHigh        473.03   (6.0%)          494.08   (8.5%)    4.4% ( -9% -  20%)    0.056
                  LowTerm       1060.12   (3.2%)         1111.79   (6.5%)    4.9% ( -4% -  15%)    0.003
{code}

Note how the flame chart for the "applyExceptionsIn32Space" step changes when all of the exception bytes are read at once (compare with the charts in my last comment above); the bulk read seems to have helped. Reading the exception bytes now appears to take ~1/2 of the time spent applying the exceptions, vs.
~2/3 as before:

!bulk_read_1.png|width=845,height=109!

!bulk_read_2.png|width=839,height=84!

> Explore PFOR for Doc ID delta encoding (instead of FOR)
> -------------------------------------------------------
>
>                 Key: LUCENE-9850
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9850
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/codecs
>    Affects Versions: main (9.0)
>            Reporter: Greg Miller
>            Priority: Minor
>         Attachments: apply_exceptions.png, bulk_read_1.png, bulk_read_2.png, for.png, pfor.png
>
>
> It'd be interesting to explore using PFOR instead of FOR for doc ID encoding.
> Right now PFOR is used for positions, frequencies and payloads, but FOR is
> used for doc ID deltas. From a recent
> [conversation|http://mail-archives.apache.org/mod_mbox/lucene-dev/202103.mbox/%3CCAPsWd%2BOp7d_GxNosB5r%3DQMPA-v0SteHWjXUmG3gwQot4gkubWw%40mail.gmail.com%3E]
> on the dev mailing list, it sounds like this decision was made based on the
> optimization possible when expanding the deltas.
> I'd be interested in measuring the index size reduction possible with
> switching to PFOR compared to the performance reduction we might see by no
> longer being able to apply the deltas in as optimal a way.
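
For readers following along, here is a rough sketch of the "read the exception bytes all at once" idea discussed in the comment. It is not the code in the linked branch. The class name, method names and the byte layout (each exception stored as an index byte followed by a high-bits byte) are assumptions made purely for illustration, and a `ByteBuffer` stands in for the index input. The sketch patches the low bits of a block of doc ID deltas with its exceptions, first byte-by-byte and then with one bulk read, and then expands the deltas into doc IDs with a prefix sum.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Illustrative only: hypothetical names and layout, not Lucene's actual implementation.
final class PforPatchSketch {

  // Byte-by-byte variant: two separate single-byte reads per exception.
  static void applyExceptionsByteByByte(
      int[] values, int bitsPerValue, int numExceptions, ByteBuffer in) {
    for (int i = 0; i < numExceptions; i++) {
      int index = Byte.toUnsignedInt(in.get());    // position of the exception in the block
      int highBits = Byte.toUnsignedInt(in.get()); // bits that did not fit in bitsPerValue
      values[index] |= highBits << bitsPerValue;
    }
  }

  // Bulk variant: one read pulls in every exception byte, then patching
  // works on the local array only.
  static void applyExceptionsBulk(
      int[] values, int bitsPerValue, int numExceptions, ByteBuffer in) {
    byte[] buf = new byte[numExceptions * 2];
    in.get(buf); // single bulk read of all (index, highBits) pairs
    for (int i = 0; i < numExceptions; i++) {
      int index = Byte.toUnsignedInt(buf[2 * i]);
      int highBits = Byte.toUnsignedInt(buf[2 * i + 1]);
      values[index] |= highBits << bitsPerValue;
    }
  }

  // Expands patched deltas into absolute doc IDs, starting from base.
  static void prefixSum(int[] deltas, int base) {
    int acc = base;
    for (int i = 0; i < deltas.length; i++) {
      acc += deltas[i];
      deltas[i] = acc;
    }
  }

  public static void main(String[] args) {
    // Low 2 bits of four deltas; the third delta is really 23 = 3 | (5 << 2).
    int[] block = {1, 2, 3, 1};
    byte[] exceptions = {2, 5}; // one exception: index 2, high bits 5
    applyExceptionsBulk(block, 2, 1, ByteBuffer.wrap(exceptions));
    prefixSum(block, 10);
    System.out.println(Arrays.toString(block)); // prints [11, 13, 36, 37]
  }
}
```

Both variants produce the same patched block; the bulk one only changes how the bytes are fetched. The separate prefix-sum pass also hints at the trade-off raised in the issue description: with exceptions in play, the deltas have to be patched before they can be expanded.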