Re: need help with keyword spamming

2016-04-23 Thread Erick Erickson
The problem here is defining "irrelevant". There's nothing in Solr that magically can determine "this term is irrelevant in this doc, but this other one isn't".
Best,
Erick
On Sat, Apr 23, 2016 at 11:08 AM, GW wrote:
> No. My project is retail based. I mean people putting in a slew of
> irreleva

Re: need help with keyword spamming

2016-04-23 Thread GW
No. My project is retail based. I mean people putting in a slew of irrelevant keywords in addition to relevant keywords in an attempt to get hits on searches and hits outside of context. I used a filter factory to remove duplicates.
On 23 April 2016 at 11:30, Doug Turnbull < dturnb...@opensourcec

Re: need help with keyword spamming

2016-04-23 Thread Doug Turnbull
By keyword spamming, do you mean stuffing the same term over and over to game term frequency? If so, you might want to try tuning BM25 similarity for your needs. It has a saturation point for term frequency. http://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-releva
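
To illustrate the saturation point mentioned above, here is a rough Python sketch of the term-frequency part of the standard BM25 formula. It is not Solr or Lucene code. The function name bm25_tf_weight is made up for this example, the IDF factor is left out, and the defaults k1=1.2 and b=0.75 are the commonly used values (also the defaults in Lucene's BM25Similarity).

```python
def bm25_tf_weight(tf, doc_len=100.0, avg_doc_len=100.0, k1=1.2, b=0.75):
    """Term-frequency component of BM25 (IDF omitted).

    Hypothetical helper for illustration only, not a Solr/Lucene API.
    tf          -- how many times the term occurs in the document
    doc_len     -- length of this document in tokens
    avg_doc_len -- average document length in the collection
    k1          -- controls how quickly term frequency saturates
    b           -- controls how strongly document length is normalized
    """
    # Longer-than-average documents are penalized through this factor.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # The weight grows with tf but can never reach k1 + 1.
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


if __name__ == "__main__":
    # Repeating a term gives rapidly diminishing returns.
    for tf in (1, 2, 5, 10, 50, 100):
        print(f"tf={tf:>3}  weight={bm25_tf_weight(tf):.3f}")
```

With these settings the weight approaches k1 + 1 = 2.2 but never reaches it, so repeating a keyword a hundred times earns only a little more than repeating it ten times. A lower k1 makes the curve flatten sooner. Note that this only addresses repetition of the same term; it does nothing about irrelevant keywords that each appear once.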