[ https://issues.apache.org/jira/browse/LUCENE-9845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302645#comment-17302645 ]
Robert Muir commented on LUCENE-9845:
-------------------------------------

OK, but I don't think that is typical: we should optimize for the typical case. It is of course fine to optimize the sparse case separately too, the same as DocValues. But from what I see, the dense case is currently suffering (instantiating large arrays into RAM, binary searching those arrays, arrays that shouldn't need to be there in the first place).

> Improve encoding of HNSW graph offsets
> --------------------------------------
>
>                 Key: LUCENE-9845
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9845
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Michael Sokolov
>            Priority: Major
>
> Today we use a simple {{long[]}} to encode the offset of each document's
> array of neighbors in the HNSW graph. Instead we should use a data structure
> that is optimized for encoding an increasing numeric array, like monotonic
> {{PackedInts}}.
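For context, the proposal maps onto Lucene's existing monotonic packed-ints support in {{org.apache.lucene.util.packed}}. Below is a minimal sketch (not the committed patch) of how per-document offsets could be written with {{DirectMonotonicWriter}} and read back with random access via {{DirectMonotonicReader}}; the file names, block shift, and example offsets are illustrative assumptions.

{code:java}
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.RandomAccessInput;
import org.apache.lucene.util.packed.DirectMonotonicReader;
import org.apache.lucene.util.packed.DirectMonotonicWriter;

public class MonotonicOffsetsSketch {
  public static void main(String[] args) throws Exception {
    // Offsets into the neighbor-array data, one per document. They are
    // increasing by construction, which is what the monotonic encoding exploits.
    long[] offsets = {0, 128, 256, 400, 512, 777};
    int blockShift = 12; // illustrative: 2^12 values per block

    try (Directory dir = new ByteBuffersDirectory()) {
      // Write: the writer bit-packs each value's deviation from a per-block
      // linear approximation, so nearly-evenly-spaced offsets cost only a
      // few bits per value instead of 8 bytes in a long[].
      try (IndexOutput meta = dir.createOutput("offsets.meta", IOContext.DEFAULT);
           IndexOutput data = dir.createOutput("offsets.data", IOContext.DEFAULT)) {
        DirectMonotonicWriter writer =
            DirectMonotonicWriter.getInstance(meta, data, offsets.length, blockShift);
        for (long offset : offsets) {
          writer.add(offset);
        }
        writer.finish();
      }

      // Read: random access by index straight off the slice, with no large
      // long[] materialized into RAM and no binary search over it.
      try (IndexInput meta = dir.openInput("offsets.meta", IOContext.DEFAULT);
           IndexInput data = dir.openInput("offsets.data", IOContext.DEFAULT)) {
        DirectMonotonicReader.Meta readerMeta =
            DirectMonotonicReader.loadMeta(meta, offsets.length, blockShift);
        RandomAccessInput slice = data.randomAccessSlice(0, data.length());
        DirectMonotonicReader reader = DirectMonotonicReader.getInstance(readerMeta, slice);
        for (int doc = 0; doc < offsets.length; doc++) {
          System.out.println("doc " + doc + " -> offset " + reader.get(doc));
        }
      }
    }
  }
}
{code}

This addresses both halves of the complaint above: the dense case no longer instantiates a heap-resident {{long[]}}, and lookups are direct {{get(doc)}} calls rather than binary searches.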