[ 
https://issues.apache.org/jira/browse/LUCENE-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaival Parikh updated LUCENE-10611:
-----------------------------------
    Description: 
The HNSW graph search does not account for visitedLimit being reached while it is still searching the upper levels of the graph.

This occurs when the pre-filter is too restrictive (its match count sets the visitedLimit). Instead of switching over to exactSearch, the search tries to [pop from an empty heap|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java#L90] and throws an IllegalStateException.
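
The failure mode can be illustrated with a minimal, self-contained sketch. It is not Lucene's HnswGraphSearcher code: the class, the LevelResult type, and the method names below are hypothetical, and "closest node" is reduced to "last node visited" to keep the example short. It only shows why a per-level result has to be checked for "incomplete" before anything is popped from it, assuming the visit budget shrinks as each upper level is searched.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.OptionalInt;

/** Hypothetical, simplified model of a top-down descent with a visit budget. */
final class VisitLimitSketch {

  /** Outcome of searching one level (illustrative type, not a Lucene class). */
  static final class LevelResult {
    final Deque<Integer> best = new ArrayDeque<>();
    int visitedCount;
    boolean incomplete;
  }

  /** Visits at most visitedLimit nodes; the last visited node stands in for the closest one. */
  static LevelResult searchLevel(int[] levelNodes, int visitedLimit) {
    LevelResult result = new LevelResult();
    for (int node : levelNodes) {
      if (result.visitedCount >= visitedLimit) {
        result.incomplete = true; // budget exhausted, possibly before collecting anything
        break;
      }
      result.visitedCount++;
      result.best.clear();
      result.best.push(node);
    }
    return result;
  }

  /**
   * Walks the upper levels from top to bottom. Returns empty when the budget runs out,
   * so the caller can fall back to another strategy instead of popping an empty queue.
   */
  static OptionalInt findEntryPoint(int[][] upperLevels, int visitedLimit) {
    int entryPoint = -1;
    for (int[] level : upperLevels) {
      LevelResult result = searchLevel(level, visitedLimit);
      if (result.incomplete || result.best.isEmpty()) {
        return OptionalInt.empty(); // guard: an unguarded pop() here would throw
      }
      entryPoint = result.best.pop();
      visitedLimit -= result.visitedCount; // the remaining budget shrinks level by level
    }
    return entryPoint < 0 ? OptionalInt.empty() : OptionalInt.of(entryPoint);
  }
}
```

With two upper levels `{5, 7}` and `{3}`, a budget of 10 yields entry point 3, while a budget of 2 is used up on the first level and the method returns an empty result rather than throwing.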

 

To reproduce this error, increase numDocs [here|https://github.com/apache/lucene/blob/main/lucene/core/src/test/org/apache/lucene/search/TestKnnVectorQuery.java#L500] to 20,000 or more (so that nodes have more neighbors and visitedLimit is reached sooner).
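
For context, the fallback to exact search mentioned above can be sketched as a brute-force scan over only the documents accepted by the filter. This is an illustration under assumptions, not KnnVectorQuery's implementation: the class and method names are hypothetical, a null approximate result is used as a stand-in for "visitedLimit was reached", and squared Euclidean distance is assumed as the similarity.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch of falling back to exact search for a restrictive filter. */
final class ExactFallbackSketch {

  /** Squared Euclidean distance between two vectors of equal length. */
  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  /** Scores only the filtered docs and returns the ids of the k nearest, closest first. */
  static int[] exactSearch(float[][] vectors, int[] acceptedDocs, float[] query, int k) {
    return Arrays.stream(acceptedDocs)
        .boxed()
        .sorted(Comparator.comparingDouble(
            (Integer doc) -> squaredDistance(vectors[doc.intValue()], query)))
        .limit(k)
        .mapToInt(Integer::intValue)
        .toArray();
  }

  /**
   * Uses the approximate result when it is available; a null value models the
   * (assumed) signal that the graph search gave up after reaching visitedLimit.
   */
  static int[] search(
      int[] approximateResult, float[][] vectors, int[] acceptedDocs, float[] query, int k) {
    return approximateResult != null
        ? approximateResult
        : exactSearch(vectors, acceptedDocs, query, k);
  }
}
```

Because the filter is restrictive, the scan touches only a few vectors, which is why an exact pass is a reasonable fallback in this situation.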

 

Stacktrace:
The heap is empty
java.lang.IllegalStateException: The heap is empty
at __randomizedtesting.SeedInfo.seed([D7BC2F56048D9D1A:A1F576DD0E795BBF]:0)
at org.apache.lucene.util.LongHeap.pop(LongHeap.java:111)
at org.apache.lucene.util.hnsw.NeighborQueue.pop(NeighborQueue.java:98)
at org.apache.lucene.util.hnsw.HnswGraphSearcher.search(HnswGraphSearcher.java:90)
at org.apache.lucene.codecs.lucene92.Lucene92HnswVectorsReader.search(Lucene92HnswVectorsReader.java:236)
at org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsReader.search(PerFieldKnnVectorsFormat.java:272)
at org.apache.lucene.index.CodecReader.searchNearestVectors(CodecReader.java:235)
at org.apache.lucene.search.KnnVectorQuery.approximateSearch(KnnVectorQuery.java:159)

  was:
The HNSW graph search does not consider that visitedLimit may be reached in the 
upper levels of graph search itself

This occurs when the pre-filter is too restrictive (and its count sets the 
visitedLimit). So instead of switching over to exactSearch, it tries to [pop from an empty heap|https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java#L90] and throws an error

 

To reproduce this error, we can +increase the numDocs [here|https://github.com/apache/lucene/blob/main/lucene/core/src/test/org/apache/lucene/search/TestKnnVectorQuery.java#L500] to 20,000+ (so that nodes have more neighbors, and visitedLimit is reached 
faster)

 

Stacktrace:
`The heap is empty
java.lang.IllegalStateException: The heap is empty
at __randomizedtesting.SeedInfo.seed([D7BC2F56048D9D1A:A1F576DD0E795BBF]:0)
at org.apache.lucene.util.LongHeap.pop(LongHeap.java:111)
at org.apache.lucene.util.hnsw.NeighborQueue.pop(NeighborQueue.java:98)
at org.apache.lucene.util.hnsw.HnswGraphSearcher.search(HnswGraphSearcher.java:90)
at org.apache.lucene.codecs.lucene92.Lucene92HnswVectorsReader.search(Lucene92HnswVectorsReader.java:236)
at org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsReader.search(PerFieldKnnVectorsFormat.java:272)
at org.apache.lucene.index.CodecReader.searchNearestVectors(CodecReader.java:235)
at org.apache.lucene.search.KnnVectorQuery.approximateSearch(KnnVectorQuery.java:159)`


> KnnVectorQuery throwing Heap Error for Restrictive Filters
> ----------------------------------------------------------
>
>                 Key: LUCENE-10611
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10611
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Kaival Parikh
>            Priority: Minor


