Re: [PR] skip keyword in German Normalization Filter [lucene]

via GitHub Thu, 27 Mar 2025 00:15:19 -0700


xzhang9292 commented on PR #14416:
URL: https://github.com/apache/lucene/pull/14416#issuecomment-2756971573


   > This keyword is legacy, for stemmers not normalizers. Just use
   > ProtectedTermFilter which works with any tokenfilter without requiring
   > modification to its code?
   
   @rmuir Thank you for your comment. Yes, ProtectedTermFilter is an option
when using GermanNormalizationFilter directly. Our problem arises when we use
GermanAnalyzer. There, the normalizer runs before the stemmer, so even if we
put a word like "Bär" in the exclusion set, the normalizer still changes it to
"bar". I think when people put a word in the exclusion set, they probably also
want the normalizer to leave that word unchanged. Would it be beneficial to let
the normalizer skip keywords as well?
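
To make the ordering issue concrete, here is a minimal, self-contained sketch. It is a toy model and not Lucene code: `Token`, `normalize`, `stem`, `analyze`, and the `normalizerHonorsKeyword` flag are hypothetical names, and the folding and stemming rules are deliberately simplified stand-ins for what the real normalizer and stemmer do. It only illustrates that a keyword flag set before the normalizer protects a term from stemming but not from normalization, unless the normalizer also checks the flag.

```java
import java.util.Locale;
import java.util.Set;

final class NormalizerKeywordSketch {

    /** Toy token carrying a keyword flag (a stand-in for a keyword attribute). */
    static final class Token {
        String term;
        boolean keyword;

        Token(String term) {
            this.term = term;
        }
    }

    /** Toy normalizer: folds a-umlaut, o-umlaut, u-umlaut and sharp s only. */
    static String normalize(String term) {
        return term.replace("\u00e4", "a")
                   .replace("\u00f6", "o")
                   .replace("\u00fc", "u")
                   .replace("\u00df", "ss");
    }

    /** Toy stemmer: strips a trailing "en" from terms longer than four characters. */
    static String stem(String term) {
        if (term.length() > 4 && term.endsWith("en")) {
            return term.substring(0, term.length() - 2);
        }
        return term;
    }

    /**
     * Runs the toy chain: lowercase, mark keywords, normalize, stem.
     * The stemmer always skips keywords. The normalizer skips them only
     * when normalizerHonorsKeyword is true (the behavior proposed above).
     */
    static String analyze(String input, Set<String> exclusions, boolean normalizerHonorsKeyword) {
        Token token = new Token(input.toLowerCase(Locale.ROOT));
        token.keyword = exclusions.contains(token.term);

        if (!(normalizerHonorsKeyword && token.keyword)) {
            token.term = normalize(token.term);
        }
        if (!token.keyword) {
            token.term = stem(token.term);
        }
        return token.term;
    }

    public static void main(String[] args) {
        Set<String> exclusions = Set.of("b\u00e4r");
        // Current ordering: the excluded term is still folded by the normalizer.
        System.out.println(analyze("B\u00e4r", exclusions, false));
        // Proposed behavior: the excluded term passes through unchanged.
        System.out.println(analyze("B\u00e4r", exclusions, true));
    }
}
```

In this model the exclusion set is matched against the lowercased term, mirroring the assumption that keyword marking happens after lowercasing in the analyzer chain.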


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
