[ https://issues.apache.org/jira/browse/LUCENE-9051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976529#comment-16976529 ]
Adrien Grand commented on LUCENE-9051:
--------------------------------------

I'm assuming we started by looking at doc-value formats in LUCENE-9004 because that was convenient, but eventually we'd have a separate XXXFormat abstraction instead? Then you could fork IndexedDISI to not implement DocIdSetIterator anymore (which is the reason why it enforces sequential access), support backward access too, and use it as an implementation detail of XXXFormat?

> Implement random access seeks in IndexedDISI (DocValues)
> --------------------------------------------------------
>
>                 Key: LUCENE-9051
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9051
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael Sokolov
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In LUCENE-9004 we have a use case for random-access seeking in DocValues,
> which currently support only forward iteration (with efficient skipping).
> One idea there was to write an entirely new format to cover these cases.
> While looking into that, I noticed that our current DocValues addressing
> implementation, {{IndexedDISI}}, already has a pretty good basis for
> providing random access. I worked up a patch that does that; we already
> have the ability to jump to a block, thanks to the jump-tables added last
> year by [~toke]; the patch uses that, and/or rewinds the iteration within
> the current block as needed.
> I did a very simple performance test, comparing forward-only iteration with
> random seeks, and in my test I saw no difference, but that can't be right, so
> I wonder if we have a more thorough performance test of DocValues somewhere
> that I could repurpose. Probably I'll go back and dig into the issue where we
> added the jump tables - I seem to recall some testing was done then.
> Aside from performance testing the implementation, there is the question of
> whether we should alter our API guarantees in this way.
> This might be controversial; I don't know the history or all the reasoning
> behind the way it is today. We provide {{advanceExact}}, and some
> implementations support docids going backwards while others don't.
> {{AssertingNumericDocValues.advanceExact}} does enforce forward iteration
> (in tests); what would the consequence be of relaxing that? We'd then open
> ourselves up to requiring all DV impls to support random access. Are there
> other impls to worry about though? I'm not sure. I'd appreciate y'all's
> input on this one.
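
For readers following along, the "jump to a block, then rewind within the current block as needed" idea from the description can be sketched roughly as below. This is a toy in-memory model, not Lucene's {{IndexedDISI}} and not the patch itself: the class name RandomAccessDocIdIndex, its methods, the int[] layout, and the block size used here are all assumptions made for illustration.

```java
/**
 * Toy model of a block-indexed doc-id set with random-access seeks.
 * Hypothetical illustration only; NOT Lucene's IndexedDISI.
 */
final class RandomAccessDocIdIndex {
    // Assumed block size for this sketch: 2^16 doc ids per block.
    private static final int BLOCK_SHIFT = 16;

    private final int maxDoc;
    private final int[] docs;       // sorted, distinct doc ids that have a value
    private final int[] blockStart; // jump table: ordinal of the first doc in each block

    private int block = -1; // block the cursor currently sits in
    private int pos = 0;    // cursor position (ordinal into docs)

    RandomAccessDocIdIndex(int[] sortedDocs, int maxDoc) {
        this.maxDoc = maxDoc;
        this.docs = sortedDocs.clone();
        int numBlocks = ((maxDoc - 1) >> BLOCK_SHIFT) + 1;
        this.blockStart = new int[numBlocks + 1];
        int prev = -1;
        for (int doc : docs) {
            if (doc <= prev || doc >= maxDoc) {
                throw new IllegalArgumentException("docs must be sorted, distinct and < maxDoc");
            }
            blockStart[(doc >>> BLOCK_SHIFT) + 1]++; // count docs per block
            prev = doc;
        }
        for (int b = 0; b < numBlocks; b++) {
            blockStart[b + 1] += blockStart[b];      // prefix sums give start ordinals
        }
    }

    /** Positions on target in any order; returns true if target has a value. */
    boolean advanceExact(int target) {
        if (target < 0 || target >= maxDoc) {
            throw new IllegalArgumentException("target out of range: " + target);
        }
        int targetBlock = target >>> BLOCK_SHIFT;
        if (targetBlock != block) {
            // Different block: use the jump table, forwards or backwards.
            block = targetBlock;
            pos = blockStart[block];
        } else if (pos >= blockStart[block + 1] || docs[pos] > target) {
            // Same block but the cursor is already past target: rewind to block start.
            pos = blockStart[block];
        }
        int end = blockStart[block + 1];
        while (pos < end && docs[pos] < target) {
            pos++; // forward scan within the block
        }
        return pos < end && docs[pos] == target;
    }

    /** Ordinal of the current doc; only meaningful after advanceExact returned true. */
    int ordinal() {
        return pos;
    }
}
```

In this sketch a backward seek within one block costs a rescan from the block start, while a seek to another block costs one jump-table lookup plus a scan, which is one reason a random-seek benchmark on small or dense data might show little difference from forward iteration.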