mikemccand commented on PR #12908: URL: https://github.com/apache/lucene/pull/12908#issuecomment-1850181921
> Hmmm... I agree we can't expect `BasePostingsFormatTestCase` to catch all bw compat problems, but the `TestLucene90PostingsFormat` from this PR writes data in the 9.8 format of the terms dictionary, and then reads the data with the 9.9 logic, only relying on version checks to read outputs with regular vlongs. So it should be exposing the bug?

Oh! I didn't realize it did this, OK. Then indeed it has a chance of uncovering the bug, but failed to do so :(

Note that the bug was quite hard to repro -- I started with various random simple strings, then realistic Unicode strings, varying counts, term lengths, etc. (using one Java tool to write the index using 9.8.0, and another to randomly search the index using 9.9.0), but the bug would not repro after a great many iterations. So then I switched to contiguous slices of `wikibigall` terms, and had to also randomize the `WildcardQuery` substring, and finally it repros quite readily.

Too many of our tests use unrealistic random terms ... realistic term distributions tickle the many `if` statements better.

Does `BasePostingsFormatTestCase` index from `LineFileDocs`? If so, the nightly line docs file is quite a bit beefier and may have a better shot at repro? (And they are somewhat extracted from `enwiki` text.)
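
A minimal sketch of the randomization described above (a contiguous slice of sorted terms, plus a random substring turned into a wildcard pattern), using only the JDK. The class and method names are hypothetical, the tiny term list is a stand-in for real `wikibigall` terms, and the actual write (9.8.0) and search (9.9.0) tools are not shown in the comment, so no Lucene calls are made here. It assumes non-empty terms and escapes the three characters `WildcardQuery` treats specially (`*`, `?`, `\`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Lucene or of the tools mentioned in the comment.
public class WildcardReproSketch {

  /** Picks `count` adjacent terms from a sorted term list, starting at a random offset. */
  public static List<String> randomSlice(List<String> sortedTerms, int count, Random random) {
    if (count <= 0 || count > sortedTerms.size()) {
      throw new IllegalArgumentException("count must be in [1, " + sortedTerms.size() + "]");
    }
    int start = random.nextInt(sortedTerms.size() - count + 1);
    return new ArrayList<>(sortedTerms.subList(start, start + count));
  }

  /** Backslash-escapes the wildcard metacharacters: star, question mark, backslash. */
  public static String escape(String text) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      if (c == '*' || c == '?' || c == '\\') {
        sb.append('\\');
      }
      sb.append(c);
    }
    return sb.toString();
  }

  /** Builds "*sub*" where sub is a random non-empty substring of a random term in the slice. */
  public static String randomWildcardPattern(List<String> slice, Random random) {
    String term = slice.get(random.nextInt(slice.size()));
    // Work in code points so a surrogate pair is never split (assumes a non-empty term).
    int cpCount = term.codePointCount(0, term.length());
    int startCp = random.nextInt(cpCount);
    int endCp = startCp + 1 + random.nextInt(cpCount - startCp);
    int start = term.offsetByCodePoints(0, startCp);
    int end = term.offsetByCodePoints(0, endCp);
    return "*" + escape(term.substring(start, end)) + "*";
  }

  public static void main(String[] args) {
    // Stand-in data; a real run would load sorted terms from a corpus.
    List<String> sortedTerms =
        List.of("anarchism", "anarchist", "anarchy", "autism", "autistic", "albedo", "alabama");
    Random random = new Random();
    List<String> slice = randomSlice(sortedTerms, 4, random);
    System.out.println("terms to index: " + slice);
    System.out.println("wildcard pattern: " + randomWildcardPattern(slice, random));
  }
}
```

The slice would be what gets indexed with the older release, and each generated pattern would be handed to a `WildcardQuery` against the newer one; logging the `Random` seed would make any failure reproducible.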