mikemccand commented on issue #14758: URL: https://github.com/apache/lucene/issues/14758#issuecomment-3606464509
> Could we do something structurally to HNSW when the documents are sorted?

+1 -- these sound like cool ideas! We do similar "do things faster if the index is sorted" optimizations, e.g. if the query sorts by the same doc values field as the index? Or ... if the query has a filter on the doc values field used to sort the index?

But those are query-time optimizations; your idea is to improve the HNSW graph construction, knowing the sort order of the docs, which is neat.

But wouldn't this limit the HNSW graph optimization to one sort criterion? Filters in general (and @kaivalnp's proposal in #15440) might be about simple multi-tenancy, but they might also be about other things, involving fields other than the index sort.

At Amazon (disclaimer: both @kaivalnp and I work at Amazon, using Lucene to power our customer-facing search over the catalog, it's fun!), our current struggle is searching within one category or sub-tree of the catalog. E.g. "autos" (yes, [you can now even buy cars on Amazon](https://www.amazon.com/Amazon-Autos/b?ie=UTF8&node=10677469011)!) is a tiny, tiny sliver of the catalog, and filtered KNN search on that sliver is horrible today. But we sort our index by a measure of ASIN engagement, so if we early-terminate, we will have evaluated the statically "best" ASINs. So ideally the choice of static index sort and performant index-time HNSW filtering would be independent.

Maybe open a new issue with your "try harder to connect together my local neighborhood when index is sorted" idea? It seems independently, possibly needle-moving.
