benwtrent opened a new issue, #15471: URL: https://github.com/apache/lucene/issues/15471
### Description When index sorting is configured, its generally assumed that the search requests will be filtered according to that sort criteria. Currently, HNSW is "filter agnostic", and does nothing around handling connections that may or may not be included in some future filter. Could we add additional "biased" connections within the lowest level of the HNSW graph? These biased connections would be in addition to the true nearest neighbor connections, but would be required to be within some ordinal range. For example, consider vector ordinal `0`, which is connected to the traditionally nearest neighbors of `148, 511, 513, 1044, 1667`. In addition to these connections, we have a restriction that each ordinal must consider nearest neighbors within an ordinal range of `100`. Now vector `0` will have new connections `10, 11, 56, 148, 511, 513, 1044, 1667`. Since the index is sorted according to a commonly filtered criteria, it is likely that ordinals that are near each other would also pass the same filters (I suppose it depends on cardinality, etc....). QDrant does something sort of like this, but they have the luxury of knowing filter criteria directly: (https://qdrant.tech/articles/filtrable-hnsw/) I think we can substitute index sorting to apply additional connections in a similar, but maybe not as robust way. Combined with ACORN (which would pair very well with this), this could really help heavily filtered search when the criteria matches the index sort. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
