pmpailis opened a new pull request, #14882: URL: https://github.com/apache/lucene/pull/14882
When running a `knn` query with a custom filter, it is possible to hit the following `IndexOutOfBoundsException` (IOOB):

```
Range [0, 0 + 65536) out of bounds for length 10083
java.lang.IndexOutOfBoundsException: Range [0, 0 + 65536) out of bounds for length 10083
    at __randomizedtesting.SeedInfo.seed([D5F220708924BDAF:C71838EC9B6B347C]:0)
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckFromIndexSize(Preconditions.java:118)
    at java.base/jdk.internal.util.Preconditions.checkFromIndexSize(Preconditions.java:397)
    at java.base/java.util.Objects.checkFromIndexSize(Objects.java:417)
    at org.apache.lucene.util.FixedBitSet.orRange(FixedBitSet.java:402)
    at org.apache.lucene.codecs.lucene90.IndexedDISI$Method$2.intoBitSetWithinBlock(IndexedDISI.java:731)
    at org.apache.lucene.codecs.lucene90.IndexedDISI.intoBitSet(IndexedDISI.java:477)
    at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$21.intoBitSet(Lucene90DocValuesProducer.java:1022)
    at org.apache.lucene.index.SingletonSortedSetDocValues.intoBitSet(SingletonSortedSetDocValues.java:93)
    at org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:380)
    at org.apache.lucene.search.AbstractKnnVectorQuery.createBitSet(AbstractKnnVectorQuery.java:234)
    at org.apache.lucene.search.AbstractKnnVectorQuery.getLeafResults(AbstractKnnVectorQuery.java:194)
    at org.apache.lucene.search.AbstractKnnVectorQuery.searchLeaf(AbstractKnnVectorQuery.java:168)
    at org.apache.lucene.search.AbstractKnnVectorQuery.lambda$rewrite$0(AbstractKnnVectorQuery.java:108)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:328)
```

This seems to happen because, after creating a `FixedBitSet` of size `maxDoc` in `AbstractKnnVectorQuery#createBitSet`

```
FixedBitSet bitSet = new FixedBitSet(maxDoc);
```

we end up passing an `upTo` value of `DocIdSetIterator.NO_MORE_DOCS` through the call to `or`.
So, when we reach `FixedBitSet.orRange`, the range check on the `dest` bitset uses a full `BLOCK_SIZE` (65536) as the length, even though the actual bitset is usually shorter than that (10083 bits in the trace above), which causes the IOOB exception. The issue is reproducible only for `IndexedDISI.Method.Dense#intoBitSetWithinBlock`, not for `Sparse`.

In this PR, we update the upper bound provided to `FixedBitSet.orRange` to take `bitSet.length()` into account if `upTo == NO_MORE_DOCS`. If this seems too broad, we can narrow the fix down to the `AbstractKnnVectorQuery` case by providing a well-defined `upTo` to `or`.
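
To make the failure mode and the proposed clamp concrete, here is a minimal sketch that uses only the JDK and no Lucene classes. It is not the code from the PR: the class `UpToClampSketch` and the helper `copyLength` are hypothetical names, the two constants are local stand-ins for `DocIdSetIterator.NO_MORE_DOCS` and the `IndexedDISI` block size, and the destination offset is assumed to be 0.

```java
import java.util.Objects;

final class UpToClampSketch {
  // Stand-in for DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE in Lucene).
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;
  // Stand-in for the dense block size seen in the exception message (65536).
  static final int BLOCK_SIZE = 65536;

  /**
   * Hypothetical helper: how many bits of the block starting at blockStart
   * may be copied into a destination bitset of destLength bits.
   * Assumes doc IDs map to bit indexes with an offset of 0.
   */
  static int copyLength(int blockStart, int upTo, int destLength) {
    // When the caller passes NO_MORE_DOCS, fall back to the destination length.
    int effectiveUpTo = (upTo == NO_MORE_DOCS) ? destLength : upTo;
    // Use long arithmetic so blockStart + BLOCK_SIZE cannot overflow.
    long blockEnd = (long) blockStart + BLOCK_SIZE;
    return (int) Math.max(0, Math.min(blockEnd, effectiveUpTo) - blockStart);
  }

  public static void main(String[] args) {
    int destLength = 10083; // bitset length from the trace

    // Unclamped: a full block is checked against the shorter destination.
    try {
      Objects.checkFromIndexSize(0, BLOCK_SIZE, destLength);
    } catch (IndexOutOfBoundsException e) {
      System.out.println("unclamped check failed: " + e.getMessage());
    }

    // Clamped: the copy length never exceeds the destination length.
    int length = copyLength(0, NO_MORE_DOCS, destLength);
    Objects.checkFromIndexSize(0, length, destLength); // does not throw
    System.out.println("clamped length: " + length); // prints "clamped length: 10083"
  }
}
```

The same helper also covers the narrower alternative: a caller that already passes a well-defined `upTo` (for example `maxDoc`) gets that value used as the bound directly.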