mikemccand commented on issue #13867: URL: https://github.com/apache/lucene/issues/13867#issuecomment-2400880851
OK I checked out git tag `releases/lucene/9.11.1` and made this small diff:

```
--- a/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestInt8HnswBackwardsCompatibility.java
+++ b/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestInt8HnswBackwardsCompatibility.java
@@ -117,7 +116,7 @@ public class TestInt8HnswBackwardsCompatibility extends BackwardsCompatibilityTe
     IndexWriterConfig conf =
         new IndexWriterConfig(new MockAnalyzer(random()))
             .setMaxBufferedDocs(10)
-            .setCodec(TestUtil.getDefaultCodec())
+            .setCodec(getCodec())
             .setMergePolicy(NoMergePolicy.INSTANCE);
     try (IndexWriter writer = new IndexWriter(dir, conf)) {
       for (int i = 0; i < DOC_COUNT; i++) {
```

I think that explains why the bwc indices did not in fact test `int8` (nor `int7`) quantization ... and why the bwc tests then did not fail with my original PR.

It makes me wonder what other bwc indices are in fact not testing what they/we think they are testing because they used the default codec ...

This is once again the dreaded "who tests the tester!" problem. Turtles all the way down ...

With the above diff, I then ran this command (still in 9.11.1 clone) to regenerate all bwc indices:

```
./gradlew test -Ptests.bwcdir=/l/9111/tmp -Ptests.useSecurityManager=false --tests TestGenerateBwcIndices -Dtests.verbose=true --max-workers=1
```

Then, I copied the newly generated `int8_hnsw.9.11.1.zip` into my `9.12.x` clone's `./lucene/backward-codecs/src/test/org/apache/lucene/backward_index/int8_hnsw.9.11.1.zip` and re-ran `./gradlew test --tests TestInt8HnswBackwardsCompatibility` and now it fails (phew!)
with:

```
   > java.lang.IllegalStateException: Quantized vector data length 70 not matching size=10 * (dim=3 + 4) = 60
   >     at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.validateFieldEntry(Lucene99ScalarQuantizedVectorsReader.java:149)
   >     at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.readFields(Lucene99ScalarQuantizedVectorsReader.java:121)
   >     at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.<init>(Lucene99ScalarQuantizedVectorsReader.java:90)
   >     at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsFormat.fieldsReader(Lucene99ScalarQuantizedVectorsFormat.java:160)
   >     at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99HnswScalarQuantizedVectorsFormat.fieldsReader(Lucene99HnswScalarQuantizedVectorsFormat.java:155)
   >     at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsReader.<init>(PerFieldKnnVectorsFormat.java:222)
   >     at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat.fieldsReader(PerFieldKnnVectorsFormat.jav\
   ...
```
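The numbers in that message deserve a second look: 10 * (3 + 4) is 70, which is exactly the data length found in the file, so the reader must have arrived at 60 some other way. Below is a minimal sketch of the arithmetic, assuming the `+ 4` is a 4-byte float stored per vector, and assuming (a guess, not confirmed in this comment) that the reader expected a packed layout with two dimensions per byte. The class and method names are hypothetical and are not Lucene APIs.

```java
public class QuantizedLengthCheck {
  // Assumption: each quantized vector carries one 4-byte float,
  // which would be the "+ 4" printed in the exception message.
  static final int TRAILER_BYTES = Float.BYTES;

  // Layout with one byte per dimension, plus the per-vector trailer.
  static long unpackedLength(int size, int dim) {
    return Math.multiplyExact((long) (dim + TRAILER_BYTES), (long) size);
  }

  // Hypothetical packed layout: two dimensions share one byte
  // (rounded up), plus the same per-vector trailer.
  static long packedLength(int size, int dim) {
    return Math.multiplyExact((long) (((dim + 1) >> 1) + TRAILER_BYTES), (long) size);
  }

  public static void main(String[] args) {
    int size = 10; // vector count from the message
    int dim = 3;   // dimension from the message
    System.out.println("unpacked: " + unpackedLength(size, dim)); // 70
    System.out.println("packed:   " + packedLength(size, dim));   // 60
  }
}
```

If that guess holds, the regenerated index stores one byte per dimension (70 bytes) while the reader computed the expected length for a packed layout (60 bytes), and the message text prints the unpacked formula next to the packed result.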