On Sep 23, 2009, at 2:55 PM, Jason Rutherglen wrote:
I don't think this handles near duplicates, which would require some of
the methods mentioned recently on the Mahout list.
It's pluggable and I believe the TextProfileSignature is a fuzzy
implementation in Solr that was brought over from Nu
I don't think this handles near duplicates, which would require some of
the methods mentioned recently on the Mahout list.
On Wed, Sep 23, 2009 at 2:59 AM, Shalin Shekhar Mangar
wrote:
> On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote:
>
>> Hi,
>> When we have news content crawled we face a probl
On Wed, Sep 23, 2009 at 3:50 PM, Ninad Raut wrote:
> Is this feature included in SOLR 1.4??
>
Yep.
--
Regards,
Shalin Shekhar Mangar.
Is this feature included in SOLR 1.4??
On Wed, Sep 23, 2009 at 3:29 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote:
>
> > Hi,
> > When we have news content crawled we face a problem of same content being
> > repeated in many docume
On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut wrote:
> Hi,
> When we have news content crawled we face a problem of same content being
> repeated in many documents. We want to add a near duplicate document filter
> to detect such documents. Is there a way to do that in SOLR?
>
Look at http://wik
Hi,
When we have news content crawled we face a problem of same content being
repeated in many documents. We want to add a near duplicate document filter
to detect such documents. Is there a way to do that in SOLR?
Regards,
Ninad Raut.