uschindler commented on PR #12582: URL: https://github.com/apache/lucene/pull/12582#issuecomment-1765355363
> > Why do we need a new top-level Codec? The Lucene main file format does not change; only the HNSW format was exchanged. As with postings formats and doc values formats, the format of the HNSW index can be detected by reading the file, and SPI is then used to look up the correct format.
>
> That's a good point. I think we'd need to increment the VERSION_CURRENT of the Lucene95HnswVectorsFormat to do the right thing when reading the data, and we could avoid the new format entirely since it's exactly the same as before (assuming that quantisation is disabled by default).

Actually, if the HNSW format has its own SPI name, it should be chosen automatically by KnnVectorsFormat.forName() when reading indexes: https://lucene.apache.org/core/9_0_0/core/org/apache/lucene/codecs/KnnVectorsFormat.html?is-external=true#forName(java.lang.String)

In short: when the top-level codec reads the index and opens the vector format, it reads the SPI name header from the file and then loads the correct code (possibly the current one, or one from the backwards-compatibility codecs). It has worked like that for years for postings and doc values.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
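
The name-based lookup described in the comment can be modelled with a toy registry. This is only a sketch of the idea (a name stored with the data selects the code that reads it), not Lucene code: the class FormatLookupSketch, the ToyFormat type, the names "OldHnswFormat" and "NewHnswFormat", and the two-string file layout are all hypothetical and chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Toy model only: every name below is hypothetical and none of it is Lucene code.
class FormatLookupSketch {

    /** Stand-in for a vectors format that is registered under a unique name. */
    static final class ToyFormat {
        final String name;

        ToyFormat(String name) {
            this.name = name;
        }

        /** Pretends to decode the body that follows the name header. */
        String read(DataInputStream in) throws IOException {
            return name + " read " + in.readUTF();
        }
    }

    // Name -> implementation, playing the role of the SPI registry.
    private static final Map<String, ToyFormat> REGISTRY = new HashMap<>();

    static {
        // An older and a newer format can be registered side by side.
        register(new ToyFormat("OldHnswFormat"));
        register(new ToyFormat("NewHnswFormat"));
    }

    static void register(ToyFormat format) {
        REGISTRY.put(format.name, format);
    }

    /** Looks up a format by name and fails loudly for unknown names. */
    static ToyFormat forName(String name) {
        ToyFormat format = REGISTRY.get(name);
        if (format == null) {
            throw new IllegalArgumentException("No format registered under name '" + name + "'");
        }
        return format;
    }

    /** Writes a toy "file": the format name as a header, then the body. */
    static byte[] write(String formatName, String body) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(formatName);
            out.writeUTF(body);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the name header first, then hands the rest to the matching format. */
    static String open(byte[] file) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            String formatName = in.readUTF();
            return forName(formatName).read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The caller never names a format when opening; the file header decides.
        System.out.println(open(write("OldHnswFormat", "segment_a")));
        System.out.println(open(write("NewHnswFormat", "segment_b")));
    }
}
```

In this sketch the reader of a file never chooses the format itself, which is why a separate top-level entry point is not needed just because one sub-format was swapped.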