[ https://issues.apache.org/jira/browse/LUCENE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302649#comment-17302649 ]
Robert Muir commented on LUCENE-9845:
-------------------------------------

{quote}
So, IndexedDISI would work OK if it provided a random access API. Graph search tends to fetch values from all over the place, and it's not great to continually reset the iterator or create a new one when you need to "go backwards." I actually played around with extending IndexedDISI with random access way back at the beginning of this HNSW patch: https://issues.apache.org/jira/browse/LUCENE-9051, but dropped that idea in the midst of just trying to get the basic feature working. Maybe we should revisit that idea?
{quote}

I think we need to not waste a lick of time improving random access, and instead prevent these algorithms from testing the user's hard drive seek speed, and figure out how to make them work in hardware-friendly ways (means SEQUENTIAL).

> Improve encoding of HNSW graph offsets
> --------------------------------------
>
>                 Key: LUCENE-9845
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9845
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Michael Sokolov
>            Priority: Major
>
> Today we use a simple {{long[]}} to encode the offset of each document's
> array of neighbors in the HNSW graph. Instead we should use a data structure
> that is optimized for encoding an increasing numeric array, like monotonic
> {{PackedInts}}
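
To illustrate the idea in the issue description, here is a minimal, self-contained sketch of monotonic packing for an increasing array of offsets. It is not Lucene's PackedInts code: the class name MonotonicOffsets and all of its members are hypothetical, and it assumes the offsets are non-decreasing and small enough that the spread of deviations fits in a long. Each value is stored as its deviation from a straight line through the first and last entries, using only as many bits as the largest deviation needs.

```java
/** Hypothetical sketch: a non-decreasing long[] stored as a linear estimate plus packed deviations. */
final class MonotonicOffsets {
  private final int size;
  private final long base;        // first offset
  private final double slope;     // average increase per index
  private final long minDev;      // smallest deviation, subtracted so stored values are >= 0
  private final int bitsPerValue; // bits used for each stored deviation
  private final long[] packed;    // deviations, bit-packed back to back

  MonotonicOffsets(long[] offsets) {
    size = offsets.length;
    base = size == 0 ? 0 : offsets[0];
    slope = size <= 1 ? 0 : (double) (offsets[size - 1] - base) / (size - 1);
    long lo = 0, hi = 0;
    for (int i = 0; i < size; i++) {
      long dev = offsets[i] - expected(i);
      lo = Math.min(lo, dev);
      hi = Math.max(hi, dev);
    }
    minDev = lo;
    long range = hi - lo;
    bitsPerValue = range == 0 ? 0 : 64 - Long.numberOfLeadingZeros(range);
    packed = new long[(int) (((long) size * bitsPerValue + 63) / 64)];
    for (int i = 0; i < size; i++) {
      write(i, offsets[i] - expected(i) - minDev);
    }
  }

  /** Value predicted by the straight line; identical at encode and decode time. */
  private long expected(int index) {
    return base + (long) (slope * index);
  }

  private void write(int index, long value) {
    if (bitsPerValue == 0) return;
    long bitPos = (long) index * bitsPerValue;
    int word = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    packed[word] |= value << shift;
    if (shift + bitsPerValue > 64) {
      // The value straddles two words: spill the high bits into the next one.
      packed[word + 1] |= value >>> (64 - shift);
    }
  }

  /** Random-access read of the original offset at the given index. */
  long get(int index) {
    long dev = 0;
    if (bitsPerValue != 0) {
      long bitPos = (long) index * bitsPerValue;
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      dev = packed[word] >>> shift;
      if (shift + bitsPerValue > 64) {
        dev |= packed[word + 1] << (64 - shift);
      }
      dev &= -1L >>> (64 - bitsPerValue); // keep only the low bitsPerValue bits
    }
    return expected(index) + minDev + dev;
  }

  int bitsPerValue() {
    return bitsPerValue;
  }

  int size() {
    return size;
  }
}
```

For the six offsets 0, 32, 64, 100, 128, 160, this sketch needs 3 bits per entry, compared with 64 bits per entry for a plain long[]. A production encoding would typically also split the array into blocks so that one irregular region does not widen every entry.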