Michael Sokolov created LUCENE-9051:
---------------------------------------

             Summary: Implement random access seeks in IndexedDISI (DocValues)
                 Key: LUCENE-9051
                 URL: https://issues.apache.org/jira/browse/LUCENE-9051
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael Sokolov


In LUCENE-9004 we have a use case for random-access seeking in DocValues, which 
currently only support forward-only iteration (with efficient skipping). One 
idea there was to write an entirely new format to cover these cases. While 
looking into that, I noticed that our current DocValues addressing 
implementation, {{IndexedDISI}}, already has a pretty good basis for providing 
random access. I worked up a patch that does that. We already have the 
ability to jump to a block, thanks to the jump tables added last year by 
[~toke]; the patch uses those and/or rewinds the iteration within the current 
block as needed.
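
As a rough illustration of the jump-or-rewind idea (this is not the patch and not 
Lucene's {{IndexedDISI}}), here is a minimal sketch. The class 
{{SeekableDocIdSet}}, its fields, and the fixed block size of 65536 doc IDs are 
assumptions made up for the example: a jump table maps each block to its first 
entry, a seek into a different block goes through the table, and a backward seek 
within the current block restarts from the start of that block.

```java
/** Hypothetical, simplified block-addressed doc ID set; NOT Lucene's IndexedDISI. */
final class SeekableDocIdSet {
  // Assumption for this sketch: each block covers 65536 consecutive doc IDs.
  private static final int BLOCK_SHIFT = 16;

  private final int[] docs;       // sorted doc IDs that have a value
  private final int[] blockStart; // jump table: index in docs of the first entry of each block
  private int block = 0;          // block of the most recent seek
  private int pos = 0;            // position in docs within the current block
  private int lastTarget = -1;    // most recent seek target

  /** sortedDocs must be sorted ascending, with every doc ID in [0, maxDoc); maxDoc must be > 0. */
  SeekableDocIdSet(int[] sortedDocs, int maxDoc) {
    docs = sortedDocs.clone();
    int numBlocks = ((maxDoc - 1) >>> BLOCK_SHIFT) + 1;
    blockStart = new int[numBlocks + 1];
    int i = 0;
    for (int b = 0; b < numBlocks; b++) {
      blockStart[b] = i;
      while (i < docs.length && (docs[i] >>> BLOCK_SHIFT) == b) {
        i++;
      }
    }
    blockStart[numBlocks] = docs.length; // sentinel marking the end of the last block
  }

  /** Returns true if target has a value. target may be smaller than the previous target. */
  boolean advanceExact(int target) {
    int targetBlock = target >>> BLOCK_SHIFT;
    if (targetBlock != block) {
      // Different block: jump through the table, forward or backward.
      block = targetBlock;
      pos = blockStart[block];
    } else if (target < lastTarget) {
      // Same block but going backward: rewind to the start of the block.
      pos = blockStart[block];
    }
    lastTarget = target;
    int end = blockStart[block + 1];
    // Scan forward within the block until we reach or pass the target.
    while (pos < end && docs[pos] < target) {
      pos++;
    }
    return pos < end && docs[pos] == target;
  }

  /** Ordinal of the current doc among all docs with values; valid after advanceExact returned true. */
  int index() {
    return pos;
  }
}
```

In this sketch the cost of a backward seek is bounded by the size of one block, 
since the scan never restarts from the beginning of the whole set.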

I did a very simple performance test comparing forward-only iteration with 
random seeks, and in my test I saw no difference. That can't be right, so I 
wonder if we have a more thorough performance test of DocValues somewhere that I 
could repurpose. I'll probably go back and dig into the issue where we added 
the jump tables; I seem to recall some testing was done then.

Aside from performance testing the implementation, there is the question of 
whether we should alter our API guarantees in this way. This might be 
controversial; I don't know the history or all the reasoning behind the way it 
is today. We provide {{advanceExact}}, and some implementations support docids 
going backwards while others don't. {{AssertingNumericDocValues.advanceExact}} 
does enforce forward iteration (in tests); what would the consequence be of 
relaxing that? We'd then open ourselves up to requiring all DV impls to support 
random access. Are there other impls to worry about, though? I'm not sure. I'd 
appreciate y'all's input on this one.
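
To make the API question concrete, here is a small sketch of what a forward-only 
check and its relaxation could look like. It is written in the spirit of an 
asserting wrapper but is not Lucene's code: the {{ExactSeeker}} interface, the 
{{OrderCheckingSeeker}} class, and the {{allowBackward}} flag are all hypothetical 
names invented for the example.

```java
/** Hypothetical minimal interface for this sketch; Lucene's doc-values classes have more methods. */
interface ExactSeeker {
  boolean advanceExact(int target);
}

/** Hypothetical wrapper that rejects backward seeks unless allowBackward is set. */
final class OrderCheckingSeeker implements ExactSeeker {
  private final ExactSeeker in;
  private final boolean allowBackward;
  private int lastTarget = -1;

  OrderCheckingSeeker(ExactSeeker in, boolean allowBackward) {
    this.in = in;
    this.allowBackward = allowBackward;
  }

  @Override
  public boolean advanceExact(int target) {
    // Forward-only contract: each target must be >= the previous one.
    if (!allowBackward && target < lastTarget) {
      throw new IllegalStateException("backward seek: " + target + " < " + lastTarget);
    }
    lastTarget = target;
    return in.advanceExact(target);
  }
}
```

With {{allowBackward}} set to false, a caller that seeks to doc 4 and then doc 2 
fails fast. With it set to true, the same sequence is passed through, and 
correctness then depends on every wrapped implementation really supporting 
backward seeks, which is the concern raised above.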


