Re: Stemming with SOLR

2016-12-18 Thread Lasitha Wattaladeniya
Thank you all for the replies. I am considering the suggestions. On 17 Dec 2016 01:50, "Susheel Kumar" wrote: > To handle irregular nouns ( > http://www.ef.com/english-resources/english-grammar/singular-and-plural-nouns/), > the simplest way is to handle them using StemmerOverrideFilterFactory. The lis

Re: Stemming with SOLR

2016-12-16 Thread Susheel Kumar
To handle irregular nouns (http://www.ef.com/english-resources/english-grammar/singular-and-plural-nouns/), the simplest way is to handle them using StemmerOverrideFilterFactory. The list is not so long. Otherwise, go for commercial solutions like basistech etc., as Alex suggested, or you can customize Hun
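
For readers following the thread, the override idea above can be sketched outside Solr. This is only a toy illustration in Python, not Solr code: the dictionary entries, the function names, and the naive plural-stripping fallback are all assumptions made for the example. In Solr itself the override filter (StemmerOverrideFilterFactory) reads its mappings from a dictionary file and is placed before the stemmer in the analyzer chain.

```python
# Toy illustration of "override first, stem second". Not Solr code.
# All names and dictionary entries below are hypothetical.

# Irregular forms that a rule-based stemmer cannot derive on its own.
IRREGULAR_OVERRIDES = {
    "children": "child",
    "mice": "mouse",
    "feet": "foot",
    "caught": "catch",
    "ran": "run",
}


def naive_suffix_stem(token: str) -> str:
    """Stand-in for a rule-based stemmer: strips a single plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def stem_with_overrides(token: str) -> str:
    """Look the token up in the override table; fall back to the stemmer."""
    lowered = token.lower()
    if lowered in IRREGULAR_OVERRIDES:
        return IRREGULAR_OVERRIDES[lowered]
    return naive_suffix_stem(lowered)


if __name__ == "__main__":
    for word in ["Mice", "cats", "caught", "class"]:
        print(word, "->", stem_with_overrides(word))
```

The point of the design is that the override table stays small: only the irregular forms need entries, and everything else still goes through the regular stemmer.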

Re: Stemming with SOLR

2016-12-15 Thread Alexandre Rafalovitch
If you need a full-fidelity solution that takes care of multiple edge cases, it could be worth looking at commercial solutions. http://www.basistech.com/ has one, including a free-level SaaS plan. Regards, Alex. http://www.solr-start.com/ - Resources for Solr users, new and experienced O

Re: Stemming with SOLR

2016-12-15 Thread Lasitha Wattaladeniya
Hi all, Thanks for the replies. @Erick, Ahmet: since those stemmers are rule-based ("logical") stemmers, they won't work on irregular words such as caught, ran and so on. So in our case they won't work. @Susheel: Yes, I thought about it, but the problem we have is that the documents we index are somewhat large text, so copy fielding

Re: Stemming with SOLR

2016-12-15 Thread Susheel Kumar
We did an extensive comparison of Snowball, KStem and Hunspell in the past, and there are cases where one of them works better than the others. You may utilise all three of them by having 3 different fields (fieldTypes) and, at query time, searching in all of them. For some of the cases where
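
A rough sketch of what querying all three fields could look like from a client, assuming the eDisMax query parser and three hypothetical field names (text_snowball, text_kstem, text_hunspell), each backed by a fieldType that uses a different stemmer. None of these names come from the thread; the snippet only builds request parameters and does not contact a Solr server.

```python
from urllib.parse import urlencode

# Hypothetical field names: one field per stemmer, populated via copyField.
STEMMED_FIELDS = ["text_snowball", "text_kstem", "text_hunspell"]


def build_multi_stemmer_query(user_query: str) -> dict:
    """Build request parameters that search every stemmed field at once."""
    return {
        "q": user_query,
        "defType": "edismax",             # eDisMax accepts a list of query fields
        "qf": " ".join(STEMMED_FIELDS),   # space-separated field list
    }


params = build_multi_stemmer_query("running mice")
query_string = urlencode(params)

if __name__ == "__main__":
    print(query_string)
```

The trade-off is index size: each extra stemmed field holds another analyzed copy of the text.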

Re: Stemming with SOLR

2016-12-15 Thread Ahmet Arslan
Hi, KStemFilter returns legitimate English words, please use it. Ahmet On Thursday, December 15, 2016 6:17 PM, Lasitha Wattaladeniya wrote: Hello devs, I'm trying to develop this indexing and querying flow where it converts the words to their original form (lemmatization). I was doing a bit of

Re: Stemming with SOLR

2016-12-15 Thread Erick Erickson
What about things like PorterStemFilterFactory, EnglishMinimalStemFilterFactory and the like? Best, Erick On Thu, Dec 15, 2016 at 7:16 AM, Lasitha Wattaladeniya wrote: > Hello devs, > > I'm trying to develop this indexing and querying flow where it converts the > words to their original form (lemm

Stemming with SOLR

2016-12-15 Thread Lasitha Wattaladeniya
Hello devs, I'm trying to develop this indexing and querying flow where it converts the words to their original form (lemmatization). I have been doing a bit of research lately, but the information on the internet is very limited. I tried using HunspellStemFilterFactory but it doesn't convert the word to its original