Michael Froh created LUCENE-9567:
------------------------------------

             Summary: JapanesePartOfSpeechStopFilterFactory should load built-in stop tags by default
                 Key: LUCENE-9567
                 URL: https://issues.apache.org/jira/browse/LUCENE-9567
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/analysis
    Affects Versions: 8.6
            Reporter: Michael Froh


If JapanesePartOfSpeechStopFilterFactory is given empty args, it does nothing. 
It doesn't load any stop tags, and just passes along the TokenStream passed to 
create().

As a default behavior, this is trappy, since a user may add the filter without 
explicitly adding any arguments and assume that it would load a "default" stop 
set. Or they may assume that if an explicit argument is required then an 
exception will be thrown. Regardless, "doing nothing" is almost certainly not 
what the user intended.

I'm going to attach a patch to load the default stop tags (using 
{{JapaneseAnalyzer.getDefaultStopTags()}}) if no args are specified, which 
probably makes sense in 9.0 (as it's consistent with e.g. 
KoreanPartOfSpeechStopFilterFactory). If we want to apply a fix to 8.x, maybe 
throw an exception to let the user know that the FilterFactory probably isn't 
doing what they think it's doing?
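
To make the three options concrete, here is a rough, dependency-free sketch. It 
is not the attached patch and it does not use Lucene classes: the class name, 
the Policy enum, the placeholder tag names, and the comma-separated "tags" 
value are all assumptions made for illustration (the real factory treats 
"tags" as a resource file name, and the real defaults come from 
{{JapaneseAnalyzer.getDefaultStopTags()}}).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StopTagPolicySketch {

    /** The three behaviors discussed for a factory that receives no "tags" argument. */
    enum Policy { PASS_THROUGH, LOAD_DEFAULTS, FAIL_FAST }

    // Placeholder for JapaneseAnalyzer.getDefaultStopTags(); these tag names are invented.
    static final Set<String> DEFAULT_STOP_TAGS =
        new HashSet<>(Arrays.asList("particle", "auxiliary-verb"));

    /** Returns the stop tags to use, or null when the filter should be skipped. */
    static Set<String> resolveStopTags(Map<String, String> args, Policy policy) {
        String tags = args.get("tags");
        if (tags != null) {
            // Simplification: the real factory treats this value as a resource file name.
            return new HashSet<>(Arrays.asList(tags.split(",")));
        }
        switch (policy) {
            case LOAD_DEFAULTS:
                return DEFAULT_STOP_TAGS; // proposed for 9.0
            case FAIL_FAST:
                // possible 8.x behavior: surface the misconfiguration
                throw new IllegalArgumentException(
                    "No stop tags configured; the filter would do nothing");
            default:
                return null; // PASS_THROUGH: the behavior reported in this issue
        }
    }

    /** Drops tokens, given as {surface, tag} pairs, whose tag is in stopTags. */
    static List<String[]> filter(List<String[]> tokens, Set<String> stopTags) {
        if (stopTags == null) {
            return tokens; // nothing configured: tokens are passed along untouched
        }
        List<String[]> kept = new ArrayList<>();
        for (String[] token : tokens) {
            if (!stopTags.contains(token[1])) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With an empty args map, PASS_THROUGH returns every token unchanged, 
LOAD_DEFAULTS removes tokens tagged with one of the placeholder defaults, and 
FAIL_FAST throws before any token is processed.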



