[ 
https://issues.apache.org/jira/browse/LUCENE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422467#comment-17422467
 ] 

Julie Tibshirani edited comment on LUCENE-10130 at 9/30/21, 1:03 AM:
---------------------------------------------------------------------

+1 to switch to a better data structure that could help perf. Some context that 
might be helpful: this set tracks the graph nodes that have already been 
visited during a kNN search. In 
https://issues.apache.org/jira/browse/LUCENE-9937 on dataset of 1M points, I 
saw the number of visited nodes range from around 1000 to 20,000, depending on 
k. It's supposed to be roughly logarithmic. When there are deleted docs the 
number could climb much higher, since we might visit a bunch of deleted docs 
until we find good vectors to return.


was (Author: julietibs):
+1 to switch to a better data structure that could help perf. Some context that 
might be helpful: this set tracks the graph nodes that have already been 
visited during a kNN search. In 
https://issues.apache.org/jira/browse/LUCENE-9937 on dataset of 1M points, I 
saw the number of visited nodes range from around 1000 to 20,000, depending on 
k. When there are deleted docs this number could climb much higher, since we 
might visit a bunch of deleted docs until we find good vectors to return.

> HnswGraph could make use of a SparseFixedBitSet.getAndSet
> ---------------------------------------------------------
>
>                 Key: LUCENE-10130
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10130
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-10130.patch
>
>
> Currently HnswGraph uses SparseFixedBitSet "visited" to track where it has 
> already been. The logic currently looks like this:
> {code}
> if (visited.get(entryPoint) == false) {
>   visited.set(entryPoint);
>   ... logic ...
> }
> {code}
> If SparseFixedBitSet had a {{getAndSet}} (like FixedBitSet), the code could 
> be:
> {code}
> if (visited.getAndSet(entrypoint) == false) {
>   ... logic ...
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to