benwtrent opened a new issue, #15471:
URL: https://github.com/apache/lucene/issues/15471

   ### Description
   
   When index sorting is configured, its generally assumed that the search 
requests will be filtered according to that sort criteria. 
   
   Currently, HNSW is "filter agnostic", and does nothing around handling 
connections that may or may not be included in some future filter. 
   
   Could we add additional "biased" connections within the lowest level of the 
HNSW graph?  These biased connections would be in addition to the true nearest 
neighbor connections, but would be required to be within some ordinal range. 
   
   For example, consider vector ordinal `0`, which is connected to the 
traditionally nearest neighbors of `148, 511, 513, 1044, 1667`. In addition to 
these connections, we have a restriction that each ordinal must consider 
nearest neighbors within an ordinal range of `100`. Now vector `0` will have 
new connections `10, 11, 56, 148, 511, 513, 1044, 1667`. 
   
   Since the index is sorted according to a commonly filtered criteria, it is 
likely that ordinals that are near each other would also pass the same filters 
(I suppose it depends on cardinality, etc....). 
   
   QDrant does something sort of like this, but they have the luxury of knowing 
filter criteria directly: (https://qdrant.tech/articles/filtrable-hnsw/)
   
   I think we can substitute index sorting to apply additional connections in a 
similar, but maybe not as robust way. 
   
   Combined with ACORN (which would pair very well with this), this could 
really help heavily filtered search when the criteria matches the index sort.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to