Correction: Java heap size should be RAM buffer size, if I'm not mistaken.
-Original message-
From: Markus Jelsma
Sent: Wed 29-09-2010 01:17
To: solr-user@lucene.apache.org;
Subject: RE: Re: Solr Deduplication and Field Collpasing
If you can set the digest field for your
00:57
To: solr-user@lucene.apache.org;
Subject: Re: Solr Deduplication and Field Collpasing
I have the digest field already in the schema because the index is shared
between nutch docs and others. I do not know if the second approach is the
quickest in my case.
I can set the digest value to something
date the digest field with the value from the corresponding id
field using Solr?
Thanks
Raj
- Original Message -
From: Markus Jelsma
To: solr-user@lucene.apache.org
Sent: Tue Sep 28 18:19:17 2010
Subject: RE: Solr Deduplication and Field Collpasing
You could create a custom update processor that adds a digest field for newly
added documents that do not have the digest field themselves. This way, the
documents that are not added by Nutch get a proper non-empty digest field so
the deduplication processor won't create the same empty hash and
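
To illustrate the idea only, here is a minimal sketch in plain Python on dictionaries. It is not the Solr update processor API; a real processor would be written in Java against Solr's update chain. The field names "id" and "digest" follow the thread, while the function name ensure_digest and the choice of MD5 over the id value are assumptions made for the example.

```python
import hashlib


def ensure_digest(doc, id_field="id", digest_field="digest"):
    """Return a copy of doc that is guaranteed to have a non-empty digest.

    Documents that already carry a digest (for example those indexed by
    Nutch) are copied unchanged. Documents without one get an MD5 hex
    digest of their id value. MD5-of-id is an assumption for this sketch.
    """
    if doc.get(digest_field):
        return dict(doc)
    filled = dict(doc)
    id_bytes = str(doc[id_field]).encode("utf-8")
    filled[digest_field] = hashlib.md5(id_bytes).hexdigest()
    return filled


# Hypothetical sample documents, one with a digest and one without.
nutch_doc = {"id": "http://example.com/", "digest": "abc123"}
other_doc = {"id": "product-42"}

print(ensure_digest(nutch_doc)["digest"])
print(ensure_digest(other_doc)["digest"])
```

Because each non-Nutch document gets a digest derived from its own id, two such documents no longer share the same empty value, which is the point of the suggestion above.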