Hi Remi,

The NorwegianLightStemFilter does not support protwords itself, but it does
respect the KeywordAttribute. Put a KeywordMarkerFilter in front of the stemmer
to mark a list of words as keywords and protect them from stemming.

http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/KeywordMarkerFilter.html
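
In schema.xml that could look roughly like the sketch below. The field type
name "text_no", the tokenizer and the lowercase filter are assumptions on my
part, so adapt them to your existing chain. The important part is that the
keyword marker comes before the stemmer:

```xml
<!-- Sketch only: "text_no", the tokenizer and the lowercase filter are assumed -->
<fieldType name="text_no" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Sets the KeywordAttribute on every token listed in protwords.txt -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- The stemmer skips tokens that carry the KeywordAttribute -->
    <filter class="solr.NorwegianLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

With lærere, lærer and lære listed one per line in protwords.txt, those tokens
should pass through the stemmer unchanged. Reindex after changing the chain.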

Cheers,
Markus

 
 
-----Original message-----
> From:Remi Mikalsen <remi.mikal...@iktsenteret.no>
> Sent: Fri 01-Mar-2013 14:46
> To: solr-user@lucene.apache.org
> Subject: NorwegianLightStemFilterFactory and protected words
> 
> While the NorwegianLightStemFilterFactory generally works very well, I have 
> come across a few words I'd very much like not to stem.
> 
> The following words:
>  - lærere (teachers)
>  - lærer (teacher)
>  - lære (teach)
> 
> all match:
>  - lær (leather)
> 
> I tried adding protected="protwords.txt" to my 
> NorwegianLightStemFilterFactory filter, and adding the following words to my 
> protwords.txt file:
>  - lærere
>  - lærer
>  - lære
> 
> It didn't work (I use the protwords.txt for other purposes and it works 
> there). After looking around, it *seems* this particular FilterFactory 
> doesn't support protwords the same way for example 
> SnowballPorterFilterFactory does.
> 
> I wonder if there is an alternative way to stop those words from being 
> processed by the NorwegianLightStemFilterFactory? 
> 
> 
> Regards,
> 
> -- 
> Remi Mikalsen
> Senter for IKT i utdanningen
> 
