Re: Stemmer Question

Jamie Johnson Thu, 08 Mar 2012 19:59:14 -0800

I'd be very interested to see how you did this if it is available. Does
this seem like something useful to the community at large?


On Thursday, March 8, 2012, Ahmet Arslan <iori...@yahoo.com> wrote:
>> Thanks, the KeywordMarkerFilterFactory seems to be what I was looking
>> for. I'm still wondering about keeping the unstemmed word as a token,
>> though. While I know that this would increase the index size slightly,
>> I wonder what the downside of doing such a thing would be? It just
>> seems less destructive, since I would always store both the unstemmed
>> version and the stemmed version. By not storing the unstemmed version,
>> there is no way to go back without reindexing. If I wanted to implement
>> this, I'm assuming a custom tokenizer would be most appropriate? Does
>> something like this already exist?
>
> Not out-of-the-box. Actually, I was using your idea: I implemented such a
> custom token filter by mixing the synonym filter and the stem filter. This
> is useful for wildcard queries. And for normal queries, this could rank
> exact matches higher.
>
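
For anyone wondering what such a filter might look like, here is a minimal
sketch of the idea in plain Python. It is not the filter described above and
it does not use the Lucene/Solr API: Token, toy_stem and
keep_original_and_stem are hypothetical names made up for illustration, and
toy_stem is only a stand-in for a real stemmer. The point it shows is the
synonym-style trick of emitting the stem at the same position as the original
word (position increment 0), so both forms end up in the index.

```python
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    text: str
    position_increment: int  # 0 means "same position as the previous token"


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer: strips a few English suffixes."""
    for suffix in ("ing", "es", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keep_original_and_stem(words: Iterable[str]) -> Iterator[Token]:
    """Emit each word unchanged, then its stem stacked at the same position."""
    for word in words:
        # The unstemmed word advances the position, like a normal token.
        yield Token(word, 1)
        stem = toy_stem(word)
        # Emit the stem only when it differs, to avoid duplicate tokens.
        if stem != word:
            yield Token(stem, 0)


if __name__ == "__main__":
    for token in keep_original_and_stem(["running", "dogs", "cat"]):
        print(token)
```

Because the unstemmed form is kept, a wildcard query can match against the
original word, and an exact match hits both tokens while a stem-only match
hits one, which is presumably where the ranking benefit would come from.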
