rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038169885
I'll try to dig into it to at least find the offending component. If we can narrow it down to the problematic charfilter, tokenizer, or tokenfilter, we can make an easier-to-reproduce case.

In the past I've done this by creating a manual test (think: a custom analyzer built from the exact components printed out) that consumes the exact string, and adding it to "TestBugInSomething" until I can whittle it down. We'll need to move TestBugInSomething.java to the integration tests (`analysis.tests`) alongside TestRandomChains, so we can do this with failures that involve multiple analysis modules.

The fact that it only reproduces some of the time is also annoying, and possibly a separate bug of its own in the test...
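
The whittling-down step can be sketched generically. This is only an illustration of the idea, not Lucene code: `ChainMinimizer`, `minimize`, and the `stillFails` predicate are hypothetical names, and the predicate stands in for "build an analyzer from these components, feed it the exact string, and report whether the failure still shows up". It is assumed to return false for chains that cannot be built at all (for example, one with no tokenizer).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Lucene: shrinks a failing chain of
// components down to a smaller chain that still triggers the failure.
class ChainMinimizer {

    // Greedily drops one component at a time and keeps the removal whenever
    // the caller-supplied predicate says the failure still reproduces.
    static <T> List<T> minimize(List<T> chain, Predicate<List<T>> stillFails) {
        List<T> current = new ArrayList<>(chain);
        boolean shrunk = true;
        while (shrunk) {
            shrunk = false;
            for (int i = 0; i < current.size(); i++) {
                List<T> candidate = new ArrayList<>(current);
                candidate.remove(i); // try the chain without component i
                if (stillFails.test(candidate)) {
                    current = candidate; // failure survives, keep the smaller chain
                    shrunk = true;
                    break; // restart the scan on the reduced chain
                }
            }
        }
        return current;
    }
}
```

Because the failure here only reproduces some of the time, a single predicate call is not a reliable signal; in practice the predicate would probably have to rerun the same chain several times and report a failure if any run fails.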