Hi,
I was wondering how I can have both Solr deduplication and partial update.
I found out that, for some reason, you cannot rely on Solr deduplication
when you try to update a document partially! It seems that when you do a
partial update on some field - even if that field is not considered as ...
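A likely reason, for anyone hitting the same thing: an atomic update only carries the changed fields, so the signature processor sees a partial document and computes a different signature than it would for the full document. One workaround sketch is to keep the signature processor out of the chain used for partial updates and pick the chain per request with the update.chain parameter (update.processor on older versions). Chain and field names below are placeholders, not a tested config:

  <!-- sketch: chain with deduplication, used for full (re)indexing -->
  <updateRequestProcessorChain name="dedupe">
    <processor class="solr.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <str name="signatureField">signature</str>
      <bool name="overwriteDupes">true</bool>
      <str name="fields">title,body</str>  <!-- placeholder field list -->
      <str name="signatureClass">solr.processor.Lookup3Signature</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

  <!-- sketch: plain chain, no signature processor, used for atomic/partial updates -->
  <updateRequestProcessorChain name="no-dedupe">
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

The partial-update requests would then pass update.chain=no-dedupe (or go to a handler that defaults to it), so the signature of the existing document is left untouched.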
Hello,
I had a configuration with "overwriteDupes"=false. I added a few
duplicate documents. Result: I got duplicate documents in the index.
When I changed to "overwriteDupes"=true, the duplicate documents started
overwriting the older documents.
Question 1: How do I achieve [add if not th...
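For context, overwriteDupes is a flag on the SignatureUpdateProcessorFactory described on the Deduplication wiki page linked later in this thread. Roughly, as a sketch with a placeholder field list:

  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <str name="fields">name,features,cat</str>  <!-- placeholder field list -->
    <!-- false: documents with an existing signature are still indexed, so
         duplicates pile up; true: they delete/replace the older document -->
    <bool name="overwriteDupes">true</bool>
  </processor>

As far as I know there is no built-in "add only if not already there, otherwise drop the new document" mode; overwriteDupes only decides whether the newer duplicate replaces the older one.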
...the behavior may be non-deterministic.

> So Solr behaves as it should :) _unexpectedly_
>
> But I agree, in the sense that there must be no error, especially not an
> NPE.
>
> Best Regards
> Alexander Aristov
>
> On 21 April 2012 03:42, Peter Markey wrote: ...
...I agree, in the sense that there must be no error, especially not an
NPE.
Best Regards
Alexander Aristov
On 21 April 2012 03:42, Peter Markey wrote:
> Hello,
>
> I have been trying out deduplication in Solr by following:
> http://wiki.apache.org/solr/Deduplication. I have defined a sign...
Hello,
I have been trying out deduplication in Solr by following:
http://wiki.apache.org/solr/Deduplication. I have defined a signature field
to hold the value of the signature created from a few other fields in a
document, and the idea seems to work like a charm in a single Solr instance.
But ...
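For reference, the single-instance setup described here boils down to a signature field in schema.xml plus wiring the dedupe chain into the update handler, something like the sketch below (names are placeholders; newer Solr versions use update.chain and solr.UpdateRequestHandler instead of update.processor and solr.XmlUpdateRequestHandler):

  schema.xml:
  <field name="signature" type="string" indexed="true" stored="true" multiValued="false" />

  solrconfig.xml:
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
    <lst name="defaults">
      <!-- send every update through the "dedupe" chain by default -->
      <str name="update.processor">dedupe</str>
    </lst>
  </requestHandler>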
Thanks Hoss,
Externalizing this part is exactly the path we are exploring now, not
only for this reason.
We already started testing Hadoop SequenceFile as a write-ahead log for
updates/deletes.
SequenceFile supports append now (simply great!). It was a pain to
have to add Hadoop into the mix for "...
: Is it possible in solr to have multivalued "id"? Or do I need to make my
: own "mv_ID" for this? Any ideas how to achieve this efficiently?
This isn't something the SignatureUpdateProcessor is going to be able to
help you with -- it does the deduplication by changing the low level
"update" (impl...
Hi,
The use case I am trying to figure out is about preserving IDs without
re-indexing on duplicates: instead, the new ID gets added to a list of
document id "aliases".
Example:
Input collection:
"id":1, "text":"dummy text 1", "signature":"A"
"id":2, "text":"dummy text 1", "signature":"A"
I add the first ...
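As far as I know nothing ships with Solr that folds the incoming id into an alias list on the already-indexed duplicate; that merge would need a custom update processor that looks up the existing document by signature and appends the new id. The schema side of such an experiment might look like the sketch below (field names are made up for the example):

  <field name="id" type="string" indexed="true" stored="true" required="true" />
  <field name="id_aliases" type="string" indexed="true" stored="true" multiValued="true" />
  <field name="signature" type="string" indexed="true" stored="true" />

With something like that in place, the desired outcome for the example above would be a single document with id=1 and id_aliases=[2].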
Not right now:
https://issues.apache.org/jira/browse/SOLR-1909
> Hi - I have the SOLR deduplication configured and working well.
>
> Is there any way I can tell which documents have not been added to the
> index as a result of the deduplication rejecting subsequent identical documents?
Hi - I have the SOLR deduplication configured and working well.
Is there any way I can tell which documents have not been added to the index as
a result of the deduplication rejecting subsequent identical documents?
Many Thanks
Jason Brown.
Correction: Java heap size should be RAM buffer size, if I'm not mistaken.
-Original message-
From: Markus Jelsma
Sent: Wed 29-09-2010 01:17
To: solr-user@lucene.apache.org
Subject: RE: Re: Solr Deduplication and Field Collapsing
If you can set the digest field for your ...
I have the digest field already in the schema because the index is shared
between Nutch docs and others. I do not know if the second approach is the
quickest in my case.
I can set the digest value to something ...
...date the digest field with the value from the corresponding id
field using Solr?
Thanks
Raj
- Original Message -
From: Markus Jelsma
To: solr-user@lucene.apache.org
Sent: Tue Sep 28 18:19:17 2010
Subject: RE: Solr Deduplication and Field Collapsing
You could create a custom update processor ...
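If the custom-processor route is taken, it would simply be registered in the update chain ahead of the signature and run processors, roughly as below. The factory class name here is purely hypothetical, just to show where such a processor would plug in:

  <updateRequestProcessorChain name="nutch-dedupe">
    <!-- hypothetical custom factory (class name made up for the example):
         it would fill in or normalise the digest on each incoming document -->
    <processor class="com.example.DigestUpdateProcessorFactory" />
    <!-- the SignatureUpdateProcessorFactory from the dedupe setup would sit here,
         so it sees the field the custom processor just populated -->
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>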
All,
I have set up Nutch to submit the crawl results to the Solr index. I have
some duplicates in the documents generated by the Nutch crawl. There is a
field 'digest' that Nutch generates that is the same for those documents
that are duplicates. While setting up the dedupe processor in the
Solr config ...
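Assuming the Nutch 'digest' field is a reliable duplicate key, one way to adapt the wiki setup to this case is to compute the signature from digest alone and let later duplicates overwrite earlier ones. An untested sketch for solrconfig.xml (adjust field names to your schema):

  <updateRequestProcessorChain name="dedupe">
    <processor class="solr.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <!-- field that stores the computed signature; it must exist in schema.xml -->
      <str name="signatureField">signature</str>
      <!-- a re-crawled duplicate replaces the earlier copy instead of piling up -->
      <bool name="overwriteDupes">true</bool>
      <!-- build the signature from the Nutch digest only -->
      <str name="fields">digest</str>
      <str name="signatureClass">solr.processor.Lookup3Signature</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

The chain still has to be attached to the update handler (update.processor / update.chain), as in the earlier sketches.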