jpountz commented on PR #13585: URL: https://github.com/apache/lucene/pull/13585#issuecomment-2240948449
> I couldn't see where we also wrote the file pointer into pos/pay files

Indeed, level 0 doesn't write pointers into pos/pay; it only records the total term freq of the block to know how many positions to skip when skipping a block. This may be the reason for the slowdown with phrase queries; I will look into recording pointers instead. Level 1 does record pointers though.

> It looks like every 4096 (SKIP_FACTOR * BLOCK_SIZE) docs we will insert a blob holding the skip data (bi-level skip list) for the next 4096 docs?

This is correct.

> Maybe you could add a quick description into the Lucene912PostingFormat.java about how the bi-level skip list is encoded. It's so simple, I love it.

I'll do it. As you can imagine, it's changed many (many many) times as I was figuring out how to make skipping fast and skip data space-efficient, but now that I'm starting to be happy with it, a better file format description would indeed be helpful!
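
For readers following the thread, the skipping scheme discussed above can be modeled in memory roughly as follows. This is only an illustrative sketch, not the on-disk encoding used by the PR. The class `BiLevelSkipSketch`, its fields, and the `advance` return shape are hypothetical. `BLOCK_SIZE = 128` and `SKIP_FACTOR = 32` are assumed values, chosen only so that their product matches the 4096 docs mentioned in the comment. Level 0 keeps a total term freq per block (a count of positions to skip), while level 1 keeps an absolute offset standing in for a file pointer.

```java
/** Toy in-memory model of a bi-level skip structure; not Lucene's actual file format. */
final class BiLevelSkipSketch {
  static final int BLOCK_SIZE = 128;  // assumed value
  static final int SKIP_FACTOR = 32;  // assumed value; 32 * 128 = 4096
  static final int LEVEL1_DOCS = SKIP_FACTOR * BLOCK_SIZE;

  private final int[] docs;                 // sorted doc IDs
  private final int[] freqs;                // term freq of each doc (= its number of positions)
  private final int[] level0LastDoc;        // last doc ID of each block of BLOCK_SIZE docs
  private final long[] level0TotalTermFreq; // positions contained in each block
  private final int[] level1LastDoc;        // last doc ID of each group of LEVEL1_DOCS docs
  private final long[] level1PosEnd;        // absolute position offset just past each group

  BiLevelSkipSketch(int[] docs, int[] freqs) {
    this.docs = docs;
    this.freqs = freqs;
    int numBlocks = (docs.length + BLOCK_SIZE - 1) / BLOCK_SIZE;
    int numGroups = (docs.length + LEVEL1_DOCS - 1) / LEVEL1_DOCS;
    level0LastDoc = new int[numBlocks];
    level0TotalTermFreq = new long[numBlocks];
    level1LastDoc = new int[numGroups];
    level1PosEnd = new long[numGroups];
    long pos = 0;
    for (int i = 0; i < docs.length; i++) {
      int block = i / BLOCK_SIZE;
      int group = i / LEVEL1_DOCS;
      pos += freqs[i];
      level0LastDoc[block] = docs[i];        // overwritten until the block's last doc
      level0TotalTermFreq[block] += freqs[i];
      level1LastDoc[group] = docs[i];        // overwritten until the group's last doc
      level1PosEnd[group] = pos;
    }
  }

  /**
   * Returns {index of the first doc >= target, offset of that doc's first position}.
   * The index equals docs.length when no such doc exists.
   */
  long[] advance(int target) {
    int group = 0;
    long pos = 0;
    // Level 1: skip 4096 docs at a time and jump straight to the recorded offset.
    while (group < level1LastDoc.length && level1LastDoc[group] < target) {
      pos = level1PosEnd[group];
      group++;
    }
    // Level 0: skip 128 docs at a time, adding up the positions of each skipped block.
    int block = group * SKIP_FACTOR;
    while (block < level0LastDoc.length && level0LastDoc[block] < target) {
      pos += level0TotalTermFreq[block];
      block++;
    }
    // Linear scan inside the block that may contain the target.
    int i = Math.min(block * BLOCK_SIZE, docs.length);
    while (i < docs.length && docs[i] < target) {
      pos += freqs[i];
      i++;
    }
    return new long[] {i, pos};
  }
}
```

In this model, skipping a level 0 block only adds to a count of positions that still have to be skipped, whereas skipping a level 1 group lands on a known offset. That mirrors the distinction drawn in the reply above.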