Re: Finding near duplicates which searching Documents

2009-09-24 Thread Grant Ingersoll
On Sep 23, 2009, at 2:55 PM, Jason Rutherglen wrote:
> I don't think this handles near duplicates, which would require some of the methods mentioned recently on the Mahout list.
It's pluggable, and I believe the TextProfileSignature is a fuzzy implementation in Solr that was brought over from Nu

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Jason Rutherglen
I don't think this handles near duplicates, which would require some of the methods mentioned recently on the Mahout list.
On Wed, Sep 23, 2009 at 2:59 AM, Shalin Shekhar Mangar wrote:
> On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote:
>> Hi,
>> When we have news content crawled we face a probl

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:50 PM, Ninad Raut wrote:
> Is this feature included in SOLR 1.4??
Yep.
--
Regards,
Shalin Shekhar Mangar.

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Ninad Raut
Is this feature included in SOLR 1.4??
On Wed, Sep 23, 2009 at 3:29 PM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
> On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote:
> > Hi,
> > When we have news content crawled we face a problem of the same content being
> > repeated in many docume

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote:
> Hi,
> When we have news content crawled we face a problem of the same content being
> repeated in many documents. We want to add a near duplicate document filter
> to detect such documents. Is there a way to do that in SOLR?
Look at http://wik

Finding near duplicates which searching Documents

2009-09-23 Thread Ninad Raut
Hi,
When we have news content crawled, we face a problem of the same content being repeated in many documents. We want to add a near-duplicate document filter to detect such documents. Is there a way to do that in SOLR?
Regards,
Ninad Raut.
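For anyone reading the archive who wants a feel for what a near-duplicate check involves, here is a minimal sketch of one common generic approach: word shingling plus Jaccard similarity. It is not Solr's deduplication code and not the TextProfileSignature algorithm mentioned in the thread. The function names, the shingle size `k=3`, and the `0.8` threshold are all assumptions chosen for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (hypothetical helper)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Too short to shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # two empty documents are considered identical
    return len(a & b) / len(a | b)


def is_near_duplicate(doc_a, doc_b, k=3, threshold=0.8):
    """True when the shingle sets overlap at least `threshold` (assumed value)."""
    return jaccard(shingles(doc_a, k), shingles(doc_b, k)) >= threshold


if __name__ == "__main__":
    original = "The quick brown fox jumps over the lazy dog near the river bank today"
    reprint = "The quick brown fox jumps over the lazy dog near the river bank yesterday"
    print(is_near_duplicate(original, reprint))
```

Comparing every pair of documents this way does not scale to a large crawl, which is why signature-based approaches (hashing each document once at index time) are usually preferred in practice.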