Michael Froh created LUCENE-9567:
------------------------------------
Summary: JapanesePartOfSpeechStopFilterFactory should load
built-in stop tags by default
Key: LUCENE-9567
URL: https://issues.apache.org/jira/browse/LUCENE-9567
Project: Lucene - Core
Issue Type: Improvement
Components: modules/analysis
Affects Versions: 8.6
Reporter: Michael Froh
If JapanesePartOfSpeechStopFilterFactory is given empty args, it does nothing.
It doesn't load any stop tags, and just passes along the TokenStream passed to
create().
As a default behavior, this is trappy, since a user may add the filter without
explicitly adding any arguments and assume that it would load a "default" stop
set. Or they may assume that if an explicit argument is required then an
exception will be thrown. Regardless, "doing nothing" is almost certainly not
what the user intended.
I'm going to attach a patch to load the default stop tags (using
{{JapaneseAnalyzer.getDefaultStopTags()}}) if no args are specified, which
probably makes sense in 9.0 (as it's consistent with e.g.
KoreanPartOfSpeechStopFilterFactory). If we want to apply a fix to 8.x, maybe
throw an exception to let the user know that the FilterFactory probably isn't
doing what they think it's doing?
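
To make the two options concrete, here is a minimal, self-contained sketch of
the decision described above. It is not the attached patch and does not use
Lucene classes: StopTagDefaults, EmptyArgsPolicy, resolveStopTags and the
placeholder tag values are all hypothetical names chosen for illustration. In
the real factory the default set would presumably come from
JapaneseAnalyzer.getDefaultStopTags().

```java
import java.util.Set;

public class StopTagDefaults {
    // Placeholder values only (assumption). The real defaults would come from
    // JapaneseAnalyzer.getDefaultStopTags() and are not reproduced here.
    static final Set<String> DEFAULT_STOP_TAGS =
            Set.of("placeholder-tag-1", "placeholder-tag-2");

    // The two behaviors floated in the issue for a factory created with empty args.
    enum EmptyArgsPolicy { LOAD_DEFAULTS, THROW }

    /**
     * Decides which stop tags the filter should use.
     *
     * @param configuredTags tags loaded from an explicit argument, or null if
     *                       the factory was given no arguments
     * @param policy         what to do when no tags were configured
     */
    static Set<String> resolveStopTags(Set<String> configuredTags, EmptyArgsPolicy policy) {
        if (configuredTags != null) {
            // An explicit configuration always wins.
            return configuredTags;
        }
        if (policy == EmptyArgsPolicy.THROW) {
            // Possible 8.x behavior: fail loudly instead of silently doing nothing.
            throw new IllegalArgumentException(
                    "No stop tags configured; the filter would pass tokens through unchanged");
        }
        // Possible 9.0 behavior: fall back to the built-in defaults.
        return DEFAULT_STOP_TAGS;
    }

    public static void main(String[] args) {
        // Empty args with the "load defaults" policy yields the built-in set.
        System.out.println(resolveStopTags(null, EmptyArgsPolicy.LOAD_DEFAULTS));
        try {
            resolveStopTags(null, EmptyArgsPolicy.THROW);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Either policy removes the current silent pass-through; the difference is only
whether existing 8.x configurations start filtering or start failing.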