shubhamvishu commented on PR #14963:
URL: https://github.com/apache/lucene/pull/14963#issuecomment-3089205957

   @jpountz Ahh, I see what you are pointing towards, and here is what I think we 
could try:
   
   - We currently also fall back to exact search after the visitedLimit is 
breached in HNSW search, so that same visited limit would now apply 
when we are iterating over the docs, i.e. net-net `approximateKnn (visit V 
nodes) + exactSearch` ~== `exactSearch (visit V nodes linearly) + exactSearch`, 
which I think might not impact the search time much. So one option is to accept 
this cost since we will visit only a small number of docs, but I agree we can 
further optimize this path (more on this in the points below).
   - We could completely remove the fallback to exactSearch in 
`AbstractKnnVectorQuery` and we could relax the check from
        - `if (knnCollector.earlyTerminated())` to 
        - `if (knnCollector instanceof TimeLimitingKnnCollectorManager.TimeLimitingKnnCollector && ((TimeLimitingKnnCollectorManager.TimeLimitingKnnCollector) knnCollector).shouldExit())`
 after making `TimeLimitingKnnCollector` public and exposing `shouldExit()`
    
     This would ensure we continue the exact search in `VectorsReader` and don't 
fall back to exactSearch in `AbstractKnnVectorQuery` (we can maybe do better, 
more on that below).
   - Though I think `AbstractKnnVectorQuery#exactSearch` is the better exact 
search implementation, since it uses a conjunctive `DocIdSetIterator` rather 
than iterating over all the docs? If so, we could maybe simply add an `else if` 
condition in VectorsReader that straightaway overwhelms the collector (forcing 
its `earlyTerminated` to return true) and returns, so it automatically falls 
back to the best exactSearch impl (best of both worlds):
   ```
       else if (getGraph(fieldEntry).equals(HnswGraph.EMPTY)) {
         // Force a fallback to exactSearch directly
         knnCollector.incVisitedCount((int) knnCollector.visitLimit() + 1);
       }
   ```
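
   To make the "overwhelm the collector" idea above concrete, here is a rough, 
self-contained sketch. `StubKnnCollector` and `ForceFallbackSketch` are 
hypothetical stand-ins that only borrow the method names `visitLimit()`, 
`visitedCount()`, `incVisitedCount()` and `earlyTerminated()`; the assumption 
that `earlyTerminated()` means "visited count has reached the limit" is mine and 
should be checked against the real collector. The sketch also shows why the 
increment may need clamping: in `(int) limit + 1` the cast binds before the 
addition, so a limit of `Integer.MAX_VALUE` would wrap to a negative increment.

```java
// Hypothetical stand-in: models only the visit counter of a kNN collector.
class StubKnnCollector {
  private final long visitLimit;
  private long visitedCount;

  StubKnnCollector(long visitLimit) {
    this.visitLimit = visitLimit;
  }

  long visitLimit() { return visitLimit; }

  long visitedCount() { return visitedCount; }

  void incVisitedCount(int count) { visitedCount += count; }

  // Assumption: early termination means the visit budget is used up.
  boolean earlyTerminated() { return visitedCount >= visitLimit; }
}

class ForceFallbackSketch {
  // Hypothetical helper: consume the remaining visit budget in one step.
  // Limits larger than Integer.MAX_VALUE are not handled by this sketch.
  static void overwhelm(StubKnnCollector collector) {
    long remaining = collector.visitLimit() - collector.visitedCount();
    // Clamp into [1, Integer.MAX_VALUE] so the int cast cannot wrap around.
    int bump = (int) Math.max(1L, Math.min(remaining, (long) Integer.MAX_VALUE));
    collector.incVisitedCount(bump);
  }

  public static void main(String[] args) {
    StubKnnCollector collector = new StubKnnCollector(100);
    overwhelm(collector);
    System.out.println("earlyTerminated = " + collector.earlyTerminated());

    // The unclamped form wraps when the limit is Integer.MAX_VALUE.
    long limit = Integer.MAX_VALUE;
    int naive = (int) limit + 1;
    System.out.println("naive increment is negative = " + (naive < 0));
  }
}
```

   Whether a limit of `Integer.MAX_VALUE` can actually reach this branch is 
something I have not verified, so the clamp is a precaution rather than a known 
bug fix.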
   
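   The relaxed check from the second point can be sketched the same way. All 
types below are hypothetical stand-ins (the real `TimeLimitingKnnCollector` is 
not public today, and I am assuming its `earlyTerminated()` combines the timeout 
with the delegate's own state). The point is only that the relaxed predicate 
reacts to a timeout and ignores a breached visit limit.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the collector interface.
interface StubCollector {
  boolean earlyTerminated();
}

// Stand-in for a collector that stops only because of the visit limit.
class VisitLimitedCollector implements StubCollector {
  private final boolean limitHit;

  VisitLimitedCollector(boolean limitHit) {
    this.limitHit = limitHit;
  }

  @Override
  public boolean earlyTerminated() { return limitHit; }
}

// Stand-in for the time-limiting wrapper with shouldExit() exposed.
class TimeLimitedCollector implements StubCollector {
  private final StubCollector delegate;
  private final BooleanSupplier timeout;

  TimeLimitedCollector(StubCollector delegate, BooleanSupplier timeout) {
    this.delegate = delegate;
    this.timeout = timeout;
  }

  boolean shouldExit() { return timeout.getAsBoolean(); }

  // Assumption: terminated if either the timeout fired or the delegate stopped.
  @Override
  public boolean earlyTerminated() {
    return shouldExit() || delegate.earlyTerminated();
  }
}

class RelaxedCheckSketch {
  // Relaxed predicate: true only when a time-limited collector timed out.
  static boolean timedOut(StubCollector collector) {
    return collector instanceof TimeLimitedCollector
        && ((TimeLimitedCollector) collector).shouldExit();
  }

  public static void main(String[] args) {
    StubCollector limitHit = new VisitLimitedCollector(true);
    StubCollector timed =
        new TimeLimitedCollector(new VisitLimitedCollector(false), () -> true);
    System.out.println("visit limit only: " + timedOut(limitHit));
    System.out.println("timeout fired: " + timedOut(timed));
  }
}
```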
   Let me know your thoughts or if I'm missing something here. Thanks!
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

