kotman12 opened a new issue, #11771:
URL: https://github.com/apache/lucene/issues/11771

   ### Description
   
   Combining KeywordRepeatFilter with OpenNLPLemmatizerFilter causes the token 
stream to end early, at an arbitrary point in the input.
   
   Steps to reproduce: run this 
[test](https://github.com/kotman12/lucene/blob/illustrate-bug/lucene/analysis/opennlp/src/test/org/apache/lucene/analysis/opennlp/TestOpenNLPLemmatizerFilterFactory.java#L324)
 and notice how no text below [this line from the test 
file](https://github.com/kotman12/lucene/blob/illustrate-bug/lucene/analysis/opennlp/src/test-files/org/apache/lucene/analysis/opennlp/data/early-exit-bug-input.txt#L20)
 gets analyzed.
   
   The root cause appears to be [an extraneous exit 
condition](https://github.com/kotman12/lucene/blob/illustrate-bug/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPLemmatizerFilter.java#L75)
 that doesn't play nicely with KeywordRepeatFilter.
   
   This is related to bug #11735 and is addressed by #11734.
   
   ### Version and environment details
   
   Latest version of Lucene, running on JDK 17.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
