benwtrent commented on PR #13401: URL: https://github.com/apache/lucene/pull/13401#issuecomment-2176636564
Wanted to touch base on this PR as it seems to have stalled, mainly because of me.

The only format that would support pluggable similarities would be `Lucene99HnswVectorsFormat`. Any of the quantized codecs would have to throw an exception on an unknown similarity name. This complicates the support matrix a user has to keep in their head: they would have to consider not only codecs but also similarities, all of which are pluggable. This inherent complexity of making things pluggable is why I think the implication of "pluggable similarities" is "just make your own format".

However, I am not against moving away from `enum` and towards a nominal/id set of core interfaces. Enums are notoriously painful for BWC, as removing one shifts the "id" of the others, and various edges then have to be smoothed out all over the place.

All this talk of a pluggable SPI for vector similarities spawned out of the complexities of adding fully BWC similarity functions and the difficulty of deprecating and moving on.

So, I propose:

- We deprecate cosine (as we already have) and remove it from being writable in v10
- Move away from enums to an id/nominal system for the similarities (what this PR could do)
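
To make the enum concern concrete, here is a minimal sketch of the difference between ordinal-based ids and a nominal lookup. Everything in it is hypothetical and only for illustration: `SimilarityIdSketch`, `Similarity`, `Registry`, and the names `"dot_product"` and `"euclidean"` are not Lucene API, and the scoring functions are plain arithmetic rather than Lucene's actual score formulas. It assumes the persisted id of an enum value is its ordinal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these types are Lucene API.
final class SimilarityIdSketch {

  // Enum approach: the persisted "id" is assumed to be the ordinal (declaration position).
  enum SimilarityV1 { EUCLIDEAN, DOT_PRODUCT, COSINE, MAX_INNER_PRODUCT }

  // The same enum after COSINE is removed: MAX_INNER_PRODUCT slides from 3 to 2.
  enum SimilarityV2 { EUCLIDEAN, DOT_PRODUCT, MAX_INNER_PRODUCT }

  // Nominal approach: a similarity is identified by a stable name.
  interface Similarity {
    String name();
    float compare(float[] a, float[] b);
  }

  interface Scorer {
    float score(float[] a, float[] b);
  }

  static Similarity of(String name, Scorer scorer) {
    return new Similarity() {
      @Override public String name() { return name; }
      @Override public float compare(float[] a, float[] b) { return scorer.score(a, b); }
    };
  }

  static final class Registry {
    private final Map<String, Similarity> byName = new LinkedHashMap<>();

    void register(Similarity s) {
      if (byName.putIfAbsent(s.name(), s) != null) {
        throw new IllegalArgumentException("duplicate similarity name: " + s.name());
      }
    }

    // A reader that does not know the name fails loudly instead of misreading an id.
    Similarity forName(String name) {
      Similarity s = byName.get(name);
      if (s == null) {
        throw new IllegalArgumentException("unknown similarity name: " + name);
      }
      return s;
    }
  }

  static Registry defaultRegistry() {
    Registry r = new Registry();
    r.register(of("dot_product", (a, b) -> {
      float sum = 0f;
      for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
      return sum;
    }));
    r.register(of("euclidean", (a, b) -> {
      float sum = 0f;
      for (int i = 0; i < a.length; i++) {
        float d = a[i] - b[i];
        sum += d * d;
      }
      return sum; // squared distance, for illustration only
    }));
    return r;
  }

  public static void main(String[] args) {
    // Removing a constant silently changes the ids of the constants after it.
    System.out.println(SimilarityV1.MAX_INNER_PRODUCT.ordinal()); // 3
    System.out.println(SimilarityV2.MAX_INNER_PRODUCT.ordinal()); // 2

    // Names stay stable no matter what else is added or removed.
    Registry registry = defaultRegistry();
    float[] a = {1f, 2f, 3f};
    float[] b = {4f, 5f, 6f};
    System.out.println(registry.forName("dot_product").compare(a, b)); // 32.0
  }
}
```

With a name-based lookup, dropping a similarity leaves the remaining names untouched, and a format that cannot handle a given name (as the quantized codecs would for unknown similarities) can reject it explicitly in `forName`.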