The problem here is defining "irrelevant". There's nothing in Solr
that can magically determine "this term is irrelevant in this doc, but
this other one isn't".
Best,
Erick
On Sat, Apr 23, 2016 at 11:08 AM, GW wrote:
> No. My project is retail based. I mean people putting in a slew of
> irreleva
No. My project is retail based. I mean people putting in a slew of
irrelevant keywords in addition to relevant keywords in an attempt to get
hits on searches and hits outside of context.
I used a filter factory to remove duplicates.
On 23 April 2016 at 11:30, Doug Turnbull <
dturnb...@opensourcec
By keyword spamming, do you mean stuffing the same term over and over to
game term frequency?
If so, you might want to try tuning BM25 similarity for your needs. It has a
saturation point for term frequency.
http://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-releva
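To illustrate the saturation point mentioned above, here is a minimal sketch of the textbook BM25 term-frequency component. It is not Solr's or Lucene's actual code; the function name, the defaults k1=1.2 and b=0.75, and the document lengths are assumptions chosen for illustration only.

```python
def bm25_tf_weight(tf, k1=1.2, b=0.75, doc_len=100.0, avg_doc_len=100.0):
    """Textbook BM25 term-frequency component (IDF left out).

    Hypothetical helper for illustration; all defaults are assumed values.
    """
    # Length normalization: longer-than-average docs are penalized when b > 0.
    norm = k1 * (1.0 - b + b * (doc_len / avg_doc_len))
    # The weight grows with tf but can never reach k1 + 1.
    return tf * (k1 + 1.0) / (tf + norm)


if __name__ == "__main__":
    # Repeating a term more often yields smaller and smaller gains.
    for tf in (1, 2, 5, 10, 50, 100):
        print(tf, round(bm25_tf_weight(tf), 3))
```

With these assumed settings the weight approaches but never reaches k1 + 1, so repeating a term 100 times buys very little over repeating it 10 times. Lowering k1 makes the curve flatten sooner; raising b penalizes long, stuffed documents more heavily.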
Hey all,
I'm just finishing up a project and I'm hoping for some direction on
dealing with keyword spamming.
I don't have any urgent issues, but I can foresee some bumps in the road.
I'm using a custom spider that pulls inventory data from several dozen
sources into a single doc schema. 1 record per