mikemccand commented on issue #14758:
URL: https://github.com/apache/lucene/issues/14758#issuecomment-3606464509

   > Could we do something structurally to HNSW when the documents are sorted? 
   
   +1 -- these sound like cool ideas!  We already have similar "do things 
faster if the index is sorted" optimizations, e.g. when the query sorts by the 
same doc values field as the index, or when the query filters on the doc values 
field used to sort the index.  But those are query-time optimizations; your 
idea is to improve the HNSW graph construction, knowing the sort order of the 
docs, which is neat.
   
   But wouldn't this limit the HNSW graph optimization to a single sort 
criterion?  Filters in general (and @kaivalnp's proposal in #15440) might be 
about simple multi-tenancy, but they might also involve fields other than the 
index sort.  At Amazon (disclaimer: both @kaivalnp and I work at Amazon, using 
Lucene to power our customer-facing search over the catalog, it's fun!), our 
current struggle is searching within one category or sub-tree of the 
catalog.  E.g. "autos" (yes, [you can now even buy cars on 
Amazon](https://www.amazon.com/Amazon-Autos/b?ie=UTF8&node=10677469011)!) is a 
tiny tiny sliver of the catalog, and filtered KNN search on that sliver is 
horrible today.  But we sort our index by a measure of ASIN engagement, so if 
we early-terminate, we will have evaluated the statically "best" ASINs.  So 
ideally the choice of static index sort, and performant index-time HNSW 
filtering, would be independent.
   
   Maybe open a new issue for your "try harder to connect together my local 
neighborhood when the index is sorted" idea?  It seems like it could move the 
needle on its own.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

