[
https://issues.apache.org/jira/browse/LUCENE-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365116#comment-17365116
]
Vigya Sharma commented on LUCENE-10008:
---------------------------------------
Thanks Chris.
The base class for {{\{CommonGrams/Stop/KeepWord}FilterFactory}} is
TokenFilterFactory, which, it seems, is also extended by non-English token
filters, e.g. the Arabic and Bengali StemFilterFactory classes.
Should we add a new base class common to {{Stop/KeepWord/CommonGrams}} to
parse these args (something like {{EnglishAnalyzerFilterFactory}})?
Or should each TokenFilterFactory be responsible for its own arg parsing? In that
case, we'll just fix this flag, and possibly add the format input to KeepWord
if it needs one (was it intentionally skipped?).
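For the "just fix this flag" option, here is a rough, JDK-only sketch of the behaviour difference. It does not use Lucene classes: {{DEFAULT_WORDS}}, {{defaultsIgnoringFlag}} and {{defaultsHonoringFlag}} are hypothetical stand-ins, and a case-insensitive TreeSet plays the role of CharArraySet. In the real factory the change would presumably be the same {{new CharArraySet(EnglishAnalyzer.ENGLISH_STOP_WORDS_SET, ignoreCase)}} wrapping that StopFilterFactory already does.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class DefaultWordsSketch {
    // Hypothetical stand-in for EnglishAnalyzer.ENGLISH_STOP_WORDS_SET:
    // a case-sensitive, unmodifiable default word set.
    static final Set<String> DEFAULT_WORDS =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList("the", "a", "of")));

    // Mirrors the reported CommonGramsFilterFactory behaviour:
    // the default set is returned as-is and ignoreCase is never consulted.
    static Set<String> defaultsIgnoringFlag(boolean ignoreCase) {
        return DEFAULT_WORDS;
    }

    // Mirrors the StopFilterFactory pattern: copy the defaults into a set
    // whose lookups honour ignoreCase (assumption: TreeSet stands in for CharArraySet).
    static Set<String> defaultsHonoringFlag(boolean ignoreCase) {
        if (!ignoreCase) {
            return DEFAULT_WORDS;
        }
        Set<String> copy = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        copy.addAll(DEFAULT_WORDS);
        return Collections.unmodifiableSet(copy);
    }

    public static void main(String[] args) {
        System.out.println(defaultsIgnoringFlag(true).contains("The")); // prints false
        System.out.println(defaultsHonoringFlag(true).contains("The")); // prints true
    }
}
```

With ignoreCase=true, only the second variant matches "The" against the default words, which is the behaviour the issue expects from CommonGramsFilterFactory.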
> CommonGramsFilterFactory doesn't respect ignoreCase=true when default
> stopwords are used
> ----------------------------------------------------------------------------------------
>
> Key: LUCENE-10008
> URL: https://issues.apache.org/jira/browse/LUCENE-10008
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Chris M. Hostetter
> Priority: Major
>
> CommonGramsFilterFactory's use of the "words" and "ignoreCase" config options
> is inconsistent with how StopFilterFactory uses them - leading to
> "ignoreCase=true" not being respected unless "words" is specified...
> StopFilterFactory...
> {code:java}
> public void inform(ResourceLoader loader) throws IOException {
>   if (stopWordFiles != null) {
>     ...
>   } else {
>     ...
>     stopWords = new CharArraySet(EnglishAnalyzer.ENGLISH_STOP_WORDS_SET,
>         ignoreCase);
>   }
> }
> {code}
> CommonGramsFilterFactory...
> {code:java}
> @Override
> public void inform(ResourceLoader loader) throws IOException {
>   if (commonWordFiles != null) {
>     ...
>   } else {
>     commonWords = EnglishAnalyzer.ENGLISH_STOP_WORDS_SET;
>   }
> }
> {code}