cbuescher opened a new pull request #1073: LUCENE-9088: JapaneseNumberFilter uses inaccurate PartOfSpeechAttribute
URL: https://github.com/apache/lucene-solr/pull/1073

Currently the JapaneseNumberFilter reads past one or more numeric tokens and then emits the newly composed token with the attributes of the token that follows the number group. This often puts the wrong attributes, e.g. the wrong part-of-speech attribute, on the numeric token, which in turn can cause subsequent filters to filter incorrectly.

This change keeps track of the state of the last numeric token while iterating over a number group and restores that state before emitting the composed numeric token, so the composed token carries the attributes of the last numeric token in the group.
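The idea can be illustrated with a minimal, self-contained sketch. It is not the code from this pull request and does not use Lucene's API; the Token class, the compose method, and the part-of-speech labels "NUMERIC" and "COUNTER" are hypothetical names chosen for illustration. In a real TokenFilter the "remembered token" would more likely be an attribute state saved and restored via AttributeSource.captureState() and restoreState().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token stream; not Lucene's API.
class NumberGroupSketch {

    // Simplified token: surface text, a part-of-speech label, and a numeric flag.
    static final class Token {
        final String text;
        final String partOfSpeech;
        final boolean numeric;

        Token(String text, String partOfSpeech, boolean numeric) {
            this.text = text;
            this.partOfSpeech = partOfSpeech;
            this.numeric = numeric;
        }
    }

    // Merges each run of numeric tokens into one token. The merged token takes
    // its part of speech from the last numeric token of the run, not from the
    // token that happens to follow the run.
    static List<Token> compose(List<Token> input) {
        List<Token> out = new ArrayList<>();
        StringBuilder number = new StringBuilder();
        Token lastNumeric = null; // "state" of the last numeric token seen

        for (Token t : input) {
            if (t.numeric) {
                number.append(t.text);
                lastNumeric = t; // remember it while reading ahead
            } else {
                if (lastNumeric != null) {
                    // Emit the composed token with the remembered attributes.
                    out.add(new Token(number.toString(), lastNumeric.partOfSpeech, true));
                    number.setLength(0);
                    lastNumeric = null;
                }
                out.add(t);
            }
        }
        if (lastNumeric != null) {
            // The input ended inside a number group.
            out.add(new Token(number.toString(), lastNumeric.partOfSpeech, true));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
            new Token("1", "NUMERIC", true),
            new Token("0", "NUMERIC", true),
            new Token("0", "NUMERIC", true),
            new Token("\u5186", "COUNTER", false)); // the yen counter character

        for (Token t : compose(tokens)) {
            System.out.println(t.text + "/" + t.partOfSpeech);
        }
    }
}
```

With the input above, the composed token "100" is labeled NUMERIC rather than COUNTER, so a later filter that drops or keeps tokens by part of speech sees the number as a number.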