jed326 opened a new issue, #14025: URL: https://github.com/apache/lucene/issues/14025
### Description

Today I see two ways to provide the distance calculation when using HNSW vectors in Lucene:

1. The existing `VectorSimilarityFunction`, which is encoded into the segment file itself.
2. A custom scorer supplied through a custom `KnnVectorsFormat`.

IMO this is not a great experience: to provide my own scorer I need to implement at least 2 new classes, yet most of the code in those classes would be boilerplate or duplicated. Really, the only novel code would be in `RandomVectorScorer#score`. I do see that we're a little bit stuck here, because the existing `VectorSimilarityFunction` class is implemented as an enum, so we can't extend it (or really make any changes to it).

I see that adding bit/binary vector support (https://github.com/apache/lucene/issues/13505) is also currently blocked on resolving this, so I wanted to ask:

1. What's the remaining gap to officially supporting bit vectors in Lucene? Naively it looks as simple as moving the new `HnswBitVectorsFormat` class introduced in #13288 into the `lucene101` package.
2. Broadly speaking, what is the vision for allowing users to customize the distance calculation? For example, is implementing a custom format/scorer the longer-term strategy, or is the long-term plan to replace `VectorSimilarityFunction` with an extensible interface?
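To make the second question concrete, here is a minimal sketch of what an extensible interface could look like. Every name in it (`VectorSimilarity`, `Similarities`, `NEGATIVE_MANHATTAN`) is hypothetical and used only for illustration; none of it is an existing Lucene API or a proposal for a specific signature.

```java
// Hypothetical sketch only: these types do not exist in Lucene.
// An interface lets users plug in their own similarity, which an enum cannot.
interface VectorSimilarity {
    // Higher return values mean "more similar".
    float compare(float[] a, float[] b);
}

final class Similarities {
    private Similarities() {}

    // A "built-in" similarity: plain dot product.
    static final VectorSimilarity DOT_PRODUCT = (a, b) -> {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    };

    // A user-defined similarity: negated Manhattan (L1) distance,
    // so that closer vectors score higher.
    static final VectorSimilarity NEGATIVE_MANHATTAN = (a, b) -> {
        float dist = 0f;
        for (int i = 0; i < a.length; i++) {
            dist += Math.abs(a[i] - b[i]);
        }
        return -dist;
    };
}
```

With something along these lines, the only code a user would have to write is the body of `compare`. How such an implementation would be identified in the segment metadata, which is what the enum does today, is left open by this sketch.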