rmuir commented on PR #11916: URL: https://github.com/apache/lucene/pull/11916#issuecomment-1310961095
@benwtrent I hit an issue when backporting to branch_9x. It may be nothing specific to 9.x, just a random seed that hasn't been encountered yet on master?

The checkindex error message is a bit generic; maybe we can improve it, as I can't immediately tell what went wrong. It seems that it got 0 results when searching for nearest neighbors on a vector... but I'm guessing this happens because we don't check Bits for deleted docs before issuing the query? If we search on a vector that only exists in a deleted doc, it's possible all the results could be deleted ones too. So I think we are just missing an "if statement" before issuing the query?

```
think:lucene[branch_9x]$ git cherry-pick 3a506ec87a01556a530eee5eb54ada49fe3cde3f
Auto-merging lucene/CHANGES.txt
[branch_9x cfbb7e9bd35] GITHUB#11911: improve checkindex to be more thorough for vectors (#11916)
 Author: Benjamin Trent <ben.w.tr...@gmail.com>
 Date: Thu Nov 10 16:45:47 2022 -0500
 2 files changed, 31 insertions(+), 6 deletions(-)
think:lucene[branch_9x]$ ./gradlew check
...
> Task :lucene:core:test

org.apache.lucene.codecs.lucene94.TestLucene94HnswVectorsFormat > testRandomBytes FAILED
    org.apache.lucene.index.CheckIndex$CheckIndexException: Field "field" failed to search k nearest neighbors
        at __randomizedtesting.SeedInfo.seed([45F1ECAB3ABC3C78:8EF8167273CC4F95]:0)
        at app//org.apache.lucene.index.CheckIndex.testVectors(CheckIndex.java:2603)
        at app//org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1011)
        at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:710)
        at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:548)
        at app//org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:343)
        at app//org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:909)
        at app//org.apache.lucene.tests.index.BaseKnnVectorsFormatTestCase.testRandomBytes(BaseKnnVectorsFormatTestCase.java:975)
        at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base@17.0.5/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base@17.0.5/java.lang.reflect.Method.invoke(Method.java:568)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
        at app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
        at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
        at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
        at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
        at app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
        at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
        at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
        at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
        at java.base@17.0.5/java.lang.Thread.run(Thread.java:833)

org.apache.lucene.codecs.lucene94.TestLucene94HnswVectorsFormat > test suite's output saved to /home/rmuir/workspace/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.codecs.lucene94.TestLucene94HnswVectorsFormat.txt, copied below:
  1> [_0.cfe, _0.cfs, _0.si, segments_2]
   > org.apache.lucene.index.CheckIndex$CheckIndexException: Field "field" failed to search k nearest neighbors
   >     at __randomizedtesting.SeedInfo.seed([45F1ECAB3ABC3C78:8EF8167273CC4F95]:0)
   >     at app//org.apache.lucene.index.CheckIndex.testVectors(CheckIndex.java:2603)
   >     at app//org.apache.lucene.index.CheckIndex.testSegment(CheckIndex.java:1011)
   >     at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:710)
   >     at app//org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:548)
   >     at app//org.apache.lucene.tests.util.TestUtil.checkIndex(TestUtil.java:343)
   >     at app//org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:909)
   >     at app//org.apache.lucene.tests.index.BaseKnnVectorsFormatTestCase.testRandomBytes(BaseKnnVectorsFormatTestCase.java:975)
   >     at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   >     at java.base@17.0.5/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
   >     at java.base@17.0.5/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >     at java.base@17.0.5/java.lang.reflect.Method.invoke(Method.java:568)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
   >     at app//org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
   >     at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
   >     at app//org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
   >     at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
   >     at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
   >     at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
   >     at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >     at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
   >     at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
   >     at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
   >     at app//com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
   >     at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
   >     at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >     at app//org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
   >     at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
   >     at app//com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
   >     at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >     at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >     at app//org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
   >     at app//org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
   >     at app//org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
   >     at app//org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
   >     at app//org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
   >     at app//org.junit.rules.RunRules.evaluate(RunRules.java:20)
   >     at app//com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   >     at app//com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
   >     at app//com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
   >     at java.base@17.0.5/java.lang.Thread.run(Thread.java:833)
  2> NOTE: reproduce with: gradlew test --tests TestLucene94HnswVectorsFormat.testRandomBytes -Dtests.seed=45F1ECAB3ABC3C78 -Dtests.badapples=true -Dtests.locale=es-BO -Dtests.timezone=America/Fortaleza -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  2> NOTE: leaving temporary files on disk at: /tmp/lucene_gradle/lucene.codecs.lucene94.TestLucene94HnswVectorsFormat_45F1ECAB3ABC3C78-001
  2> NOTE: test params are: codec=FastCompressingStoredFieldsData(storedFieldsFormat=Lucene90CompressingStoredFieldsFormat(compressionMode=FAST, chunkSize=26452, maxDocsPerChunk=3, blockShift=16), termVectorsFormat=Lucene90CompressingTermVectorsFormat(compressionMode=FAST, chunkSize=26452, maxDocsPerChunk=3, blockSize=16)), sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=es-BO, timezone=America/Fortaleza
  2> NOTE: Linux 6.0.7-arch1-1 amd64/N/A 17.0.5 (64-bit)/cpus=1,threads=1,free=398185920,total=534773760
  2> NOTE: All tests run in this JVM: [TestAssertions, TestSearch, TestAnalyzerWrapper, TestCharArraySet, TestGraphTokenFilter, TestWordlistLoader, TestCharTermAttributeImpl, TestCodecUtil, TestFastCompressionMode, TestForUtil, TestLucene90DocValuesFormatMergeInstance, TestLucene90NormsFormatMergeInstance, TestLucene90StoredFieldsFormat, TestPForUtil, TestLucene94HnswVectorsFormat]
```
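
To make the deleted-docs guess from the comment concrete, here is a minimal, self-contained sketch. It is a toy model and not Lucene code: `searchNearest` is a hypothetical brute-force stand-in that takes the k closest docs and then drops deleted ones, a `java.util.BitSet` plays the role of the live-docs `Bits`, and `countEmptySearches` is an invented helper that mimics a per-document check loop. Whether the real failure works this way is only an assumption.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

public class LiveDocsGuardSketch {

  // Squared Euclidean distance between two vectors of equal length.
  static double squaredDistance(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  // Toy k-NN (assumption, NOT Lucene's HNSW search): rank every doc by
  // distance to the query, keep the k closest, then drop deleted docs.
  static List<Integer> searchNearest(float[][] vectors, float[] query, int k, BitSet liveDocs) {
    List<Integer> docs = new ArrayList<>();
    for (int doc = 0; doc < vectors.length; doc++) {
      docs.add(doc);
    }
    docs.sort(Comparator.comparingDouble(doc -> squaredDistance(vectors[doc], query)));
    List<Integer> hits = new ArrayList<>();
    for (int doc : docs.subList(0, Math.min(k, docs.size()))) {
      if (liveDocs.get(doc)) {
        hits.add(doc);
      }
    }
    return hits;
  }

  // Searches with each doc's own vector and counts searches that return no hits.
  // With skipDeleted=true, deleted docs are never used as queries.
  static int countEmptySearches(float[][] vectors, BitSet liveDocs, int k, boolean skipDeleted) {
    int empty = 0;
    for (int doc = 0; doc < vectors.length; doc++) {
      if (skipDeleted && !liveDocs.get(doc)) {
        continue; // the hypothesized missing "if statement"
      }
      if (searchNearest(vectors, vectors[doc], k, liveDocs).isEmpty()) {
        empty++;
      }
    }
    return empty;
  }

  public static void main(String[] args) {
    // Docs 0 and 1 sit close together and are both deleted; docs 2 and 3 are live.
    float[][] vectors = {{0f, 0f}, {0.1f, 0f}, {5f, 5f}, {5.1f, 5f}};
    BitSet liveDocs = new BitSet(vectors.length);
    liveDocs.set(2);
    liveDocs.set(3);
    System.out.println("empty searches without guard: "
        + countEmptySearches(vectors, liveDocs, 2, false));
    System.out.println("empty searches with guard: "
        + countEmptySearches(vectors, liveDocs, 2, true));
  }
}
```

In this toy data the only empty searches come from the two deleted docs, so skipping deleted docs before querying removes them. In CheckIndex the equivalent guard would presumably consult the segment's live docs before issuing the query, but that is untested here.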