benwtrent opened a new issue, #12895: URL: https://github.com/apache/lucene/issues/12895
### Description

It seems that https://github.com/apache/lucene/pull/12699/ has inadvertently broken reading term dictionaries created in Lucene 9.8 and earlier.

To reproduce the bug, index wikibigall with LuceneUtil and Lucene 9.8, then force-merge. Then attempt to read the resulting index using a wildcard query:

```
Path path = Paths.get("/data/local/lucene/indices/wikibigall.lucene-main.opt.Lucene90.dvfields.nd6.72652M/index");
try (FSDirectory dir = FSDirectory.open(path);
     DirectoryReader reader = DirectoryReader.open(dir)) {
  IndexSearcher searcher = new IndexSearcher(reader);
  searcher.count(new WildcardQuery(new Term("body", "*fo*")));
}
```

This results in a trace similar to the one below:

```
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 3 out of bounds for length 3
    at org.apache.lucene.store.ByteArrayDataInput.readByte(ByteArrayDataInput.java:136)
    at org.apache.lucene.store.DataInput.readVInt(DataInput.java:110)
    at org.apache.lucene.store.ByteArrayDataInput.readVInt(ByteArrayDataInput.java:114)
    at org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnumFrame.load(IntersectTermsEnumFrame.java:158)
    at org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnumFrame.load(IntersectTermsEnumFrame.java:149)
    at org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.pushFrame(IntersectTermsEnum.java:203)
    at org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum._next(IntersectTermsEnum.java:531)
    at org.apache.lucene.codecs.lucene90.blocktree.IntersectTermsEnum.next(IntersectTermsEnum.java:373)
    at org.apache.lucene.search.MultiTermQueryConstantScoreBlendedWrapper$1.rewriteInner(MultiTermQueryConstantScoreBlendedWrapper.java:111)
    at org.apache.lucene.search.AbstractMultiTermQueryConstantScoreWrapper$RewritingWeight.rewrite(AbstractMultiTermQueryConstantScoreWrapper.java:179)
    at org.apache.lucene.search.AbstractMultiTermQueryConstantScoreWrapper$RewritingWeight.bulkScorer(AbstractMultiTermQueryConstantScoreWrapper.java:220)
    at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:930)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:678)
    at org.apache.lucene.search.IndexSearcher.lambda$4(IndexSearcher.java:636)
    at org.apache.lucene.search.TaskExecutor$TaskGroup.lambda$0(TaskExecutor.java:118)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at org.apache.lucene.search.TaskExecutor$TaskGroup.invokeAll(TaskExecutor.java:153)
    at org.apache.lucene.search.TaskExecutor.invokeAll(TaskExecutor.java:76)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:640)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:607)
    at org.apache.lucene.search.IndexSearcher.count(IndexSearcher.java:423)
    at Corruption.main(Corruption.java:18)
```

We are currently not sure whether this affects indices created with Lucene 9.9 and read via Lucene 9.9.

NOTE: This also fails with just a prefix wildcard query. It seems that all multi-term queries could be affected. Will provide more example stack traces in issue comments.

### Version and environment details

Lucene 9.9 reading Lucene 9.8 indices.
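
For readers unfamiliar with the top frames of the trace: `readVInt` decodes a variable-length integer, and `readByte` fails once the decoder walks past the end of its backing array. The sketch below is a simplified stand-in written against plain byte arrays, not Lucene's actual `ByteArrayDataInput` code; the class name `VIntOverrunDemo` and its method are hypothetical. It only illustrates how misinterpreted bytes can produce this kind of `ArrayIndexOutOfBoundsException`, and it does not claim to identify the root cause of this issue.

```java
public class VIntOverrunDemo {

    /**
     * Simplified variable-length int decoder (hypothetical helper, not Lucene code):
     * 7 payload bits per byte, low-order group first, high bit set means "more bytes follow".
     */
    public static int readVInt(byte[] buf, int pos) {
        int value = 0;
        int shift = 0;
        while (true) {
            // Throws ArrayIndexOutOfBoundsException if the buffer ends
            // while the continuation bit is still set.
            byte b = buf[pos++];
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
    }

    public static void main(String[] args) {
        // Well-formed input: 0xAC 0x02 encodes 300.
        System.out.println(readVInt(new byte[] {(byte) 0xAC, 0x02}, 0));

        // Bytes that were never written as a VInt: every byte has the
        // continuation bit set, so the decoder runs off the end of the array.
        try {
            readVInt(new byte[] {(byte) 0x81, (byte) 0x82, (byte) 0x83}, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("overrun: " + e.getMessage());
        }
    }
}
```

If the reader interprets the on-disk layout differently from the writer, bytes like those in the second call are what a decoder might plausibly encounter, which would be consistent with a format mismatch between versions.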