Re: [PR] Inline skip data into postings lists [lucene]

via GitHub Fri, 19 Jul 2024 23:23:44 -0700


jpountz commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2240948449


   > I couldn't see where we also wrote the file pointer into pos/pay files
   
   Indeed, level 0 doesn't write pointers into pos/pay. It only records the 
total term freq of the block, which tells the reader how many positions to 
skip when skipping a block. This may be the reason for the slowdown with 
phrase queries; I will look into recording pointers instead. Level 1 does 
record pointers, though.
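
   To illustrate the difference between the two approaches, here is a minimal 
sketch. It is not the code from the PR: the class, method and array names are 
hypothetical, and the per-block arrays stand in for whatever the skip data 
actually stores. With per-block counts, the reader accumulates a number of 
positions that it still has to skip later; with per-block pointers, it can 
seek directly.

```java
// Hypothetical illustration only; names and layout are assumptions.
final class PositionSkipSketch {

  // Count-based skipping: each skipped block only contributes its total
  // term freq, so the reader ends up with a number of pending positions
  // that it still has to skip over in the pos file.
  static long pendingPositions(int[] blockTotalTermFreq, int blocksToSkip) {
    long pending = 0;
    for (int i = 0; i < blocksToSkip; i++) {
      pending += blockTotalTermFreq[i];
    }
    return pending;
  }

  // Pointer-based skipping: each block records where its positions start,
  // so the reader can seek there directly.
  static long posFilePointer(long[] blockPosStart, int blocksToSkip) {
    return blockPosStart[blocksToSkip];
  }
}
```

   For example, skipping two blocks whose total term freqs are 300 and 150 
leaves 450 pending positions in the first case, whereas the second case only 
needs the recorded start pointer of the third block.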
   
   > It looks like every 4096 (SKIP_FACTOR * BLOCK_SIZE) docs we will insert a 
blob holding the skip data (bi-level skip list) for the next 4096 docs?
   
   This is correct.
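
   As a rough sketch of the arithmetic, the snippet below maps the n-th 
document of a postings list to its level-1 group and to its block within that 
group. The values BLOCK_SIZE = 128 and SKIP_FACTOR = 32 are assumptions chosen 
so that their product is the 4096 mentioned above, and the class and method 
names are hypothetical.

```java
// Hypothetical illustration only; the constant values are assumptions
// consistent with SKIP_FACTOR * BLOCK_SIZE = 4096.
final class SkipLayoutSketch {
  static final int BLOCK_SIZE = 128;  // assumed docs per block
  static final int SKIP_FACTOR = 32;  // assumed blocks per skip blob
  static final int DOCS_PER_BLOB = SKIP_FACTOR * BLOCK_SIZE; // 4096

  // Index of the skip blob (level-1 group) covering the n-th doc, 0-based.
  static int blobIndex(int docOrd) {
    return docOrd / DOCS_PER_BLOB;
  }

  // Index of the block within that group that holds the n-th doc, 0-based.
  static int blockInBlob(int docOrd) {
    return (docOrd % DOCS_PER_BLOB) / BLOCK_SIZE;
  }
}
```

   Under these assumptions, the doc at ordinal 5000 falls into the second 
group (index 1) and into the block at index 7 within it.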
   
   > Maybe you could add a quick description into the 
Lucene912PostingFormat.java about how the bi-level skip list is encoded. It's 
so simple, I love it.
   
   I'll do it. As you can imagine, it has changed many (many many) times as I 
was figuring out how to make skipping fast and skip data space-efficient, but 
now that I'm starting to be happy with it, a better file format description 
would indeed be helpful!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


