Re: Deduplication in 1.4

2009-11-26 Thread Martijn v Groningen
Message > >> From: Martijn v Groningen >> To: solr-user@lucene.apache.org >> Sent: Thu, November 26, 2009 3:19:40 AM >> Subject: Re: Deduplication in 1.4 >> >> Field collapsing has been used by many in their production >> environment. > > Got any po

Re: Deduplication in 1.4

2009-11-26 Thread Otis Gospodnetic
Hi Martijn, - Original Message > From: Martijn v Groningen > To: solr-user@lucene.apache.org > Sent: Thu, November 26, 2009 3:19:40 AM > Subject: Re: Deduplication in 1.4 > > Field collapsing has been used by many in their production > environment. Got any poi

Re: Deduplication in 1.4

2009-11-26 Thread Martijn v Groningen
Field collapsing has been used by many in their production environments. Over the last few months the stability of the patch has improved as quite a few bugs were fixed. The only big feature currently missing is caching of the collapsing algorithm. I'm currently working on that and I will put it in a new patch in

Re: Deduplication in 1.4

2009-11-25 Thread KaktuChakarabati
Hey Otis, Yep, I realized this myself after playing with the dedupe feature yesterday. So it does look like field collapsing is pretty much what I need. Any idea how close it is to being production-ready? Thanks, -Chak Otis Gospodnetic wrote: > > Hi, > > As far as I know, the point of

Re: Deduplication in 1.4

2009-11-24 Thread Otis Gospodnetic
Hi, As far as I know, the point of deduplication in Solr ( http://wiki.apache.org/solr/Deduplication ) is to detect a duplicate document before indexing it in order to avoid duplicates in the index in the first place. What you are describing is closer to the field collapsing patch in SOLR-236. Ot
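
To make the distinction in this thread concrete, here is a rough Python sketch of the two ideas: index-time deduplication (one document survives per signature) versus query-time field collapsing (all documents stay indexed, but results are grouped on a field). This is only a conceptual illustration, not Solr code; the function names, the sample fields, and the choice of MD5 as the signature are assumptions made for the example, not the behaviour of the Solr feature or of the SOLR-236 patch.

```python
import hashlib


def signature(doc, fields):
    """Hash the chosen fields into one signature (hypothetical helper)."""
    joined = "\x1f".join(str(doc.get(f, "")) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def index_with_dedupe(docs, fields):
    """Index-time deduplication: a later document with the same
    signature overwrites the earlier one, so duplicates never
    coexist in the index."""
    index = {}
    for doc in docs:
        index[signature(doc, fields)] = doc
    return list(index.values())


def collapse_on_field(results, field):
    """Query-time collapsing: every document is still indexed, but only
    the first (highest-ranked) hit per field value is returned."""
    seen = set()
    collapsed = []
    for doc in results:
        key = doc.get(field)
        if key in seen:
            continue
        seen.add(key)
        collapsed.append(doc)
    return collapsed


# Sample data, invented for illustration only.
docs = [
    {"id": 1, "title": "a", "site": "x"},
    {"id": 2, "title": "a", "site": "x"},
    {"id": 3, "title": "b", "site": "x"},
]
deduped = index_with_dedupe(docs, ["title"])
collapsed = collapse_on_field(docs, "site")
```

In the first case the duplicate is gone from the index for good; in the second, the same index can be collapsed on a different field by another query.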