[ https://issues.apache.org/jira/browse/LUCENE-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tomoko Uchida updated LUCENE-9853:
----------------------------------
    Description:
Follow-up issue of LUCENE-9413.

We now have CJKWidthCharFilter in analyzers-common. I believe that in many situations it is advisable to apply half-width/full-width character normalization before tokenization, for consistency in analysis.

The change slightly affects the analyzer's output. We can provide a parameter to switch back to CJKWidthFilter for backward compatibility.

  was:
Follow-up issue of LUCENE-9413.

We now have CJKWidthCharFilter in analyzers-common. I believe that in many situations it is advisable to apply character width normalization before tokenization, for consistency in analysis.

The change slightly affects the analyzer's output. We can provide a parameter to switch back to CJKWidthFilter for backward compatibility.

> Use CJKWidthCharFilter as the default character normalizer for JapaneseAnalyzer instead of CJKWidthFilter
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-9853
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9853
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>   Affects Versions: main (9.0)
>            Reporter: Tomoko Uchida
>            Assignee: Tomoko Uchida
>            Priority: Minor
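
As a rough illustration of the half-width/full-width normalization discussed in the description, here is a minimal sketch that uses only the JDK. It does not call Lucene: java.text.Normalizer with NFKC is used as a stand-in, and NFKC folds more than character width, so it is not equivalent to CJKWidthCharFilter. The class name WidthNormalizationSketch and the method normalizeWidth are hypothetical names chosen for this example.

```java
import java.text.Normalizer;

final class WidthNormalizationSketch {

    // Assumption: NFKC is only an approximation of width folding. It maps
    // full-width ASCII and half-width katakana to their usual forms, but it
    // also applies other compatibility mappings that a width filter may not.
    static String normalizeWidth(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Half-width katakana KA, KI, KU, each followed by the half-width
        // voiced sound mark: six code units in the raw text.
        String halfWidthKatakana = "\uFF76\uFF9E\uFF77\uFF9E\uFF78\uFF9E";

        // The word "Lucene" written with full-width Latin letters.
        String fullWidthLatin = "\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45";

        String foldedKatakana = normalizeWidth(halfWidthKatakana);

        // A tokenizer that reads the raw text sees six characters; after
        // normalization it sees three full-width katakana (GA, GI, GU).
        System.out.println(halfWidthKatakana.length() + " -> " + foldedKatakana.length());
        System.out.println(normalizeWidth(fullWidthLatin));
    }
}
```

The length change for the katakana input hints at why the order matters: a tokenizer that runs before normalization is handed a different character sequence from one that runs after it, so the resulting tokens can differ.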