[ https://issues.apache.org/jira/browse/LUCENE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315773#comment-17315773 ]
Michael McCandless commented on LUCENE-9850:
--------------------------------------------

{quote}I might expect the opposite actually. Anytime the PFOR approach has to apply exceptions, I would expect a performance hit of some sort since it has extra work to do on top of the FOR approach used today. So if the Term tasks are largely dominated by postings decoding, I would expect regressions to show up there more than elsewhere. Maybe I'm misunderstanding your comment though?
{quote}
Ahh I see, you're right! Applying more exceptions should be more costly. Though, because we compress better, we are reading fewer bytes for the same block of ints (since more exceptions are pulled out and bpv can sometimes be lower), and maybe the bulk decode of lower bpv is faster? Hard to predict :)

And net/net the QPS impact looks to be noise, but the better compression is real. So yeah +1 to do this – it sounds great :)

> Explore PFOR for Doc ID delta encoding (instead of FOR)
> -------------------------------------------------------
>
>                 Key: LUCENE-9850
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9850
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/codecs
>    Affects Versions: main (9.0)
>            Reporter: Greg Miller
>            Priority: Minor
>         Attachments: apply_exceptions.png, bulk_read_1.png, bulk_read_2.png,
> for.png, pfor.png
>
>
> It'd be interesting to explore using PFOR instead of FOR for doc ID encoding.
> Right now PFOR is used for positions, frequencies and payloads, but FOR is
> used for doc ID deltas. From a recent
> [conversation|http://mail-archives.apache.org/mod_mbox/lucene-dev/202103.mbox/%3CCAPsWd%2BOp7d_GxNosB5r%3DQMPA-v0SteHWjXUmG3gwQot4gkubWw%40mail.gmail.com%3E]
> on the dev mailing list, it sounds like this decision was made based on the
> optimization possible when expanding the deltas.
> I'd be interested in measuring the index size reduction possible with
> switching to PFOR compared to the performance reduction we might see by no
> longer being able to apply the deltas in as optimal a way.
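
To make the size side of this tradeoff concrete, here is a minimal Java sketch that estimates how many bytes one block of doc ID deltas might take under FOR versus PFOR. It is not Lucene's ForUtil or PForUtil code. The class name PforSizeSketch, the cap of 7 exceptions per block, the cost of two bytes per exception (one for the index, one for the high bits) and the 128-value block in main are all assumptions chosen for illustration. Deltas are assumed to be non-negative.

```java
import java.util.Arrays;

/** Hypothetical illustration only; this is not Lucene's ForUtil or PForUtil. */
final class PforSizeSketch {

  // Assumed cap on patched values per block (a parameter of this sketch).
  static final int MAX_EXCEPTIONS = 7;

  /** Number of bits needed to store a non-negative v; 0 needs 0 bits. */
  static int bitsRequired(long v) {
    return 64 - Long.numberOfLeadingZeros(v);
  }

  /** FOR: every value is packed at the width of the largest value in the block. */
  static int forBitsPerValue(long[] block) {
    long max = 0;
    for (long v : block) {
      max = Math.max(max, v);
    }
    return bitsRequired(max);
  }

  /** Counts the values that do not fit in the given width. */
  static int countExceptions(long[] block, int width) {
    int n = 0;
    for (long v : block) {
      if (bitsRequired(v) > width) {
        n++;
      }
    }
    return n;
  }

  /**
   * PFOR: smallest width such that at most MAX_EXCEPTIONS values overflow it
   * and the overflowing high bits fit in one byte (assumed cost model).
   */
  static int pforBitsPerValue(long[] block) {
    int maxBits = forBitsPerValue(block);
    for (int width = 0; width < maxBits; width++) {
      if (countExceptions(block, width) <= MAX_EXCEPTIONS && maxBits - width <= 8) {
        return width;
      }
    }
    return maxBits; // no cheaper width qualifies, so PFOR degenerates to FOR
  }

  /** Bytes needed to bit-pack count values at the given width. */
  static int packedBytes(int count, int bitsPerValue) {
    return (count * bitsPerValue + 7) / 8;
  }

  static int forBytes(long[] block) {
    return packedBytes(block.length, forBitsPerValue(block));
  }

  /** Packed payload plus two bytes per exception (assumed cost model). */
  static int pforBytes(long[] block) {
    int width = pforBitsPerValue(block);
    return packedBytes(block.length, width) + 2 * countExceptions(block, width);
  }

  public static void main(String[] args) {
    // Mostly small deltas with a few large outliers.
    long[] deltas = new long[128];
    Arrays.fill(deltas, 3L);
    deltas[5] = 300L;
    deltas[60] = 300L;
    deltas[100] = 300L;
    System.out.println("FOR:  bpv=" + forBitsPerValue(deltas) + " bytes=" + forBytes(deltas));
    System.out.println("PFOR: bpv=" + pforBitsPerValue(deltas) + " bytes=" + pforBytes(deltas));
  }
}
```

The sketch only models encoded size. It says nothing about decode speed, which is the other half of the question raised in this thread: applying exceptions adds work, while a lower bpv means fewer bytes to read per block.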