hossman opened a new issue, #15540: URL: https://github.com/apache/lucene/issues/15540
### Description

`IndexWriter` will happily allow applications to index documents containing `KnnByteVectorField` (and presumably `KnnFloatVectorField`) instances containing "invalid" values. These invalid vectors will not trigger an Exception from either `IndexWriter.addDocument()` or `IndexWriter.commit()`; they only cause problems down the road, during index merges or when running `CheckIndex`.

A trivial test case can be found in: [lucene.invalid-vector-indexing-with-out-failure.test.patch](https://github.com/user-attachments/files/24375789/lucene.invalid-vector-indexing-with-out-failure.test.patch)

(which uses COSINE sim and indexes `new byte[] {0,0,0,...,0}`... I'm not sure if similar problems will happen with other vector+sim combos and/or non-normalized vectors when using DOT_PRODUCT?)

AFAIK this test should fail on any system regardless of seed. The nature of the failure can be changed by modifying `tests.asserts` to influence whether:

### The problem triggers an assertion in the `KnnVectorsWriter.merge` call stack.

```
> java.lang.AssertionError: Nodes are added in the incorrect order! Comparing NaN to [1.0]
>   at __randomizedtesting.SeedInfo.seed([75D48A8DF27FDF07:8BF93348CADA4842]:0)
>   at org.apache.lucene.util.hnsw.NeighborArray.addInOrder(NeighborArray.java:80)
>   at org.apache.lucene.util.hnsw.HnswGraphBuilder.popToScratch(HnswGraphBuilder.java:461)
>   at org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNodeInternal(HnswGraphBuilder.java:286)
>   at org.apache.lucene.util.hnsw.HnswGraphBuilder.addGraphNode(HnswGraphBuilder.java:325)
>   at org.apache.lucene.util.hnsw.MergingHnswGraphBuilder.updateGraph(MergingHnswGraphBuilder.java:153)
>   at org.apache.lucene.util.hnsw.MergingHnswGraphBuilder.build(MergingHnswGraphBuilder.java:128)
>   at org.apache.lucene.util.hnsw.IncrementalHnswGraphMerger.merge(IncrementalHnswGraphMerger.java:214)
>   at org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsWriter.mergeOneField(Lucene99HnswVectorsWriter.java:444)
>   at org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsWriter.mergeOneField(PerFieldKnnVectorsFormat.java:128)
>   at org.apache.lucene.codecs.KnnVectorsWriter.merge(KnnVectorsWriter.java:105)
>   at org.apache.lucene.index.SegmentMerger.mergeVectorValues(SegmentMerger.java:272)
>   at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:315)
>   at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:159)
>   at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5276)
>   at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4739)
>   at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6538)
>   at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:38)
>   at org.apache.lucene.index.IndexWriter.executeMerge(IndexWriter.java:2333)
>   at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2328)
>   at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:2163)
>   at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:2111)
>   at org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging(TestZeroVectorHnswGraphIndexing.java:53)
...

Reproduce with: gradlew :lucene:core:test --tests "org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging" -Ptests.asserts=true -Ptests.file.encoding=ISO-8859-1 -Ptests.gui=false "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1" -Ptests.jvms=5 -Ptests.seed=75D48A8DF27FDF07 -Ptests.vectorsize=512
```

### OR ... The problem sneaks through all indexing & merging and only causes an issue when `MockDirectoryWrapper.close()` invokes `CheckIndex`

```
> org.apache.lucene.index.CheckIndex$CheckIndexException: Field "bytes" failed to search k nearest neighbors
>   at __randomizedtesting.SeedInfo.seed([9F1658AC2C860676:613BE16914239133]:0)
>   at app//org.apache.lucene.index.CheckIndex.checkByteVectorValues(CheckIndex.java:3162)
>   at app//org.apache.lucene.index.CheckIndex.testVectors(CheckIndex.java:2855)
>   at app//org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1123)
>   at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:823)
>   at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:593)
>   at app//org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:333)
>   at app//org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:917)
>   at app//org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging(TestZeroVectorHnswGraphIndexing.java:56)
...

Reproduce with: gradlew :lucene:core:test --tests "org.apache.lucene.util.hnsw.TestZeroVectorHnswGraphIndexing.testIndexingAndMerging" -Ptests.asserts=false -Ptests.file.encoding=US-ASCII -Ptests.gui=false "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1" -Ptests.jvms=5 -Ptests.seed=9F1658AC2C860676 -Ptests.vectorsize=128
```

### Version and environment details

This problem affects `main` and `branch_10x`, going back at least as far as `10.3.2`, where it was discovered through a randomized Solr test that could inadvertently generate an "all zero" vector ([SOLR-17736](https://issues.apache.org/jira/browse/SOLR-17736)).
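
For anyone wondering where the `NaN` in the assertion message could come from: a plain cosine computation over an all-zero vector divides zero by zero. The sketch below is only an illustration of that arithmetic plus an application-side guard. It does not use or reproduce Lucene's code, and `ZeroVectorCheck`, `cosine`, `isAllZero` and `requireNonZero` are hypothetical names invented for this example. Whether Lucene's own scoring follows the same path is an assumption on my part.

```java
// Hypothetical helper, not part of Lucene: illustrates why an all-zero
// byte vector is problematic under cosine similarity, and how an
// application might reject such vectors before handing them to an indexer.
final class ZeroVectorCheck {

  /** Textbook cosine similarity over signed bytes (illustration only). */
  static float cosine(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
    }
    int dot = 0;
    int normA = 0;
    int normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    // If either vector is all zeros, this is 0 / 0.0, which is NaN in Java.
    return (float) (dot / Math.sqrt((double) normA * (double) normB));
  }

  /** Returns true when every component of the vector is zero. */
  static boolean isAllZero(byte[] v) {
    for (byte x : v) {
      if (x != 0) {
        return false;
      }
    }
    return true;
  }

  /** Application-side guard: fail fast instead of indexing a zero-magnitude vector. */
  static byte[] requireNonZero(byte[] v) {
    if (isAllZero(v)) {
      throw new IllegalArgumentException("zero-magnitude vector has no defined cosine similarity");
    }
    return v;
  }

  public static void main(String[] args) {
    byte[] zero = new byte[4];
    byte[] other = {1, 2, 3, 4};
    System.out.println("cosine(zero, other) is NaN: " + Float.isNaN(cosine(zero, other)));
    try {
      requireNonZero(zero);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

A caller would wrap the array in `requireNonZero(...)` before constructing the vector field, so the failure surfaces at the point of indexing rather than at merge time. This check only covers the all-zero COSINE case from the test patch; it says nothing about the DOT_PRODUCT question raised above.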
