mikemccand commented on issue #13867:
URL: https://github.com/apache/lucene/issues/13867#issuecomment-2400880851

   OK I checked out git tag `releases/lucene/9.11.1` and made this small diff:
   
   ```
   --- a/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestInt8HnswBackwardsCompatibility.java
   +++ b/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestInt8HnswBackwardsCompatibility.java
   @@ -117,7 +116,7 @@ public class TestInt8HnswBackwardsCompatibility extends BackwardsCompatibilityTe
        IndexWriterConfig conf =
            new IndexWriterConfig(new MockAnalyzer(random()))
                .setMaxBufferedDocs(10)
   -            .setCodec(TestUtil.getDefaultCodec())
   +            .setCodec(getCodec())
                .setMergePolicy(NoMergePolicy.INSTANCE);
        try (IndexWriter writer = new IndexWriter(dir, conf)) {
          for (int i = 0; i < DOC_COUNT; i++) {
   ```
   
   The index generator was writing with `TestUtil.getDefaultCodec()` instead of
   the test's own `getCodec()`.  I think that explains why the bwc indices did
   not in fact test `int8` (nor `int7`) quantization, and why the bwc tests then
   did not fail with my original PR.  It makes me wonder what other bwc indices
   are not testing what we think they are testing because they were written
   with the default codec ...
   
   This is once again the dreaded "who tests the tester!" problem.  Turtles all 
the way down ...
   
   With the above diff applied, I then ran this command (still in the 9.11.1
   clone) to regenerate all bwc indices:
   
   ```
   ./gradlew test -Ptests.bwcdir=/l/9111/tmp -Ptests.useSecurityManager=false --tests TestGenerateBwcIndices -Dtests.verbose=true --max-workers=1
   ```
   
   Then I copied the newly generated `int8_hnsw.9.11.1.zip` into my `9.12.x`
   clone at
   `./lucene/backward-codecs/src/test/org/apache/lucene/backward_index/int8_hnsw.9.11.1.zip`
   and re-ran `./gradlew test --tests TestInt8HnswBackwardsCompatibility`, and
   now it fails (phew!) with:
   
   ```
      >     java.lang.IllegalStateException: Quantized vector data length 70 not matching size=10 * (dim=3 + 4) = 60
      >         at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.validateFieldEntry(Lucene99ScalarQuantizedVectorsReader.java:149)
      >         at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.readFields(Lucene99ScalarQuantizedVectorsReader.java:121)
      >         at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsReader.<init>(Lucene99ScalarQuantizedVectorsReader.java:90)
      >         at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsFormat.fieldsReader(Lucene99ScalarQuantizedVectorsFormat.java:160)
      >         at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.lucene99.Lucene99HnswScalarQuantizedVectorsFormat.fieldsReader(Lucene99HnswScalarQuantizedVectorsFormat.java:155)
      >         at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsReader.<init>(PerFieldKnnVectorsFormat.java:222)
      >         at org.apache.lucene.core@9.12.0-SNAPSHOT/org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat.fieldsReader(PerFieldKnnVectorsFormat.jav\
      ...
   ```
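
The numbers in that exception message are worth checking by hand. `10 * (3 + 4)` is 70, which matches the actual data length, so the reader's expected 60 must come from a smaller per-vector size than the message text suggests. One reading, which is an assumption and not something confirmed in this comment, is that the reader expected a 4-bit packed layout (two dimensions per byte) while the regenerated index stores one byte per dimension. The class and method names below are hypothetical and only reproduce the arithmetic; this is not the Lucene reader code:

```java
// Hypothetical helper that reproduces the byte counts from the exception message.
class QuantizedLengthCheck {

  // One byte per dimension plus a 4-byte float correction term per vector.
  static long unpackedBytes(int size, int dim) {
    return Math.multiplyExact((long) (dim + Float.BYTES), (long) size);
  }

  // Assumed 4-bit packed layout: two dimensions share a byte (rounded up),
  // plus the same 4-byte float per vector.
  static long packedBytes(int size, int dim) {
    return Math.multiplyExact((long) (((dim + 1) >> 1) + Float.BYTES), (long) size);
  }

  public static void main(String[] args) {
    int size = 10; // size=10 from the message
    int dim = 3;   // dim=3 from the message
    System.out.println("one byte per dim: " + unpackedBytes(size, dim)); // 70
    System.out.println("4-bit packed:     " + packedBytes(size, dim));   // 60
  }
}
```

Under that assumption, 70 is what the 9.11.1 writer produced and 60 is what the 9.12.x reader computed, which is consistent with the mismatch reported above.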


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

