[ 
https://issues.apache.org/jira/browse/LUCENE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302649#comment-17302649
 ] 

Robert Muir commented on LUCENE-9845:
-------------------------------------

{quote}
So, IndexedDISI would work OK if it provided a random access API. Graph search 
tends to fetch values from all over the place, and it's not great to 
continually reset the iterator or create a new one when you need to "go 
backwards." I actually played around with extending IndexedDISI with random 
access way back at the beginning of this HNSW patch: 
https://issues.apache.org/jira/browse/LUCENE-9051, but dropped that idea in 
the midst of just trying to get the basic feature working. Maybe we should 
revisit that idea?
{quote}

I think we should not waste a lick of time improving random access. Instead, 
we should keep these algorithms from testing the user's hard drive seek speed 
and figure out how to make them work in hardware-friendly ways (meaning 
SEQUENTIAL access).

> Improve encoding of HNSW graph offsets
> --------------------------------------
>
>                 Key: LUCENE-9845
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9845
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Michael Sokolov
>            Priority: Major
>
> Today we use a simple {{long[]}} to encode the offset of each document's 
> array of neighbors in the HNSW graph. Instead, we should use a data structure 
> that is optimized for encoding an increasing numeric array, such as monotonic 
> {{PackedInts}}.
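
To illustrate the idea in the issue description, here is a minimal, self-contained sketch of how an increasing offset array can be stored more compactly than a plain long[]. It is not Lucene's implementation and does not use the Lucene API: the class name MonotonicOffsets and all of its members are hypothetical. The sketch assumes a non-empty, increasing input whose residuals fit in a long. It predicts each value from a straight line through the first and last offsets and bit-packs only the residual from that line.

```java
/** Hypothetical sketch (not Lucene code): an increasing long[] stored as bit-packed residuals. */
final class MonotonicOffsets {
    private final long first;        // first offset, the start of the prediction line
    private final double avg;        // average increment between consecutive offsets
    private final long minResidual;  // smallest (actual - predicted), so stored values are >= 0
    private final int bitsPerValue;  // bits needed for the largest stored residual
    private final long[] packed;     // residuals packed back to back
    private final int size;

    MonotonicOffsets(long[] offsets) {
        if (offsets.length == 0) {
            throw new IllegalArgumentException("offsets must not be empty");
        }
        size = offsets.length;
        first = offsets[0];
        avg = size > 1 ? (double) (offsets[size - 1] - first) / (size - 1) : 0.0;

        // First pass: find the range of residuals around the prediction line.
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (int i = 0; i < size; i++) {
            long residual = offsets[i] - expected(i);
            min = Math.min(min, residual);
            max = Math.max(max, residual);
        }
        minResidual = min;
        bitsPerValue = Math.max(1, 64 - Long.numberOfLeadingZeros(max - min));

        // Second pass: pack each non-negative residual into bitsPerValue bits.
        packed = new long[(int) (((long) size * bitsPerValue + 63) >>> 6)];
        for (int i = 0; i < size; i++) {
            long value = offsets[i] - expected(i) - minResidual;
            long bitPos = (long) i * bitsPerValue;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            packed[word] |= value << shift;
            if (shift + bitsPerValue > 64) {
                // The value straddles two words; store its high bits in the next one.
                packed[word + 1] |= value >>> (64 - shift);
            }
        }
    }

    /** Value predicted by the straight line; encode and decode share this exact computation. */
    private long expected(int index) {
        return first + (long) (avg * index);
    }

    int bitsPerValue() {
        return bitsPerValue;
    }

    /** Decodes the offset at the given index. */
    long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int word = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = packed[word] >>> shift;
        if (shift + bitsPerValue > 64) {
            value |= packed[word + 1] << (64 - shift);
        }
        long mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
        return expected(index) + minResidual + (value & mask);
    }
}
```

When every node has the same number of neighbors, the offsets lie exactly on the line and each entry costs a single bit instead of 64. The more the neighbor counts vary, the wider the residuals become, which is why real encoders typically apply this per block rather than over the whole array.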



