having Solr deduplication and partial update

2014-10-14 Thread Ali Nazemian
Hi, I was wondering how I can have both Solr deduplication and partial updates. I found out that, for some reason, you cannot rely on Solr deduplication when you update a document partially! It seems that when you do a partial update on some field, even if that field is not considered as

Solr Deduplication use of overWriteDupes flag

2014-02-04 Thread Amit Agrawal
Hello, I had a configuration where I had "overwriteDupes"=false. I added a few duplicate documents. Result: I got duplicate documents in the index. When I changed to "overwriteDupes"=true, the duplicate documents started overwriting the older documents. Question 1: How do I achieve, [add if not th
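
The behaviour described in this message can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not Solr's implementation: `signature`, `add_document`, and the `overwrite_dupes` parameter are hypothetical names chosen to mirror the "overwriteDupes" flag, and MD5 over the selected fields stands in for whatever signature class is actually configured.

```python
import hashlib


def signature(doc, fields):
    """Hash the selected field values into a hex digest (toy stand-in for a signature class)."""
    joined = "\x1f".join(str(doc.get(f, "")) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def add_document(index, doc, fields, overwrite_dupes):
    """Append doc to index (a list); when overwrite_dupes is true,
    first drop older documents that carry the same signature."""
    stored = dict(doc, signature=signature(doc, fields))
    if overwrite_dupes:
        index[:] = [d for d in index if d["signature"] != stored["signature"]]
    index.append(stored)
    return stored


# Two documents with different ids but identical text (hypothetical data).
keep_all = []
overwrite = []
for doc in ({"id": 1, "text": "same body"}, {"id": 2, "text": "same body"}):
    add_document(keep_all, doc, ["text"], overwrite_dupes=False)
    add_document(overwrite, doc, ["text"], overwrite_dupes=True)
```

With the flag off, `keep_all` holds both documents; with it on, `overwrite` holds only the most recently added one, which matches the two results reported above.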

Re: null pointer error with solr deduplication

2012-04-23 Thread Peter Markey
erministic. > > So Solr behaves as it should :) _unexpectedly_ > > But I agree in the sense that there must be no error, especially such as > NPE. > > Best Regards > Alexander Aristov > > > On 21 April 2012 03:42, Peter Markey wrote: > > > Hello, > >

Re: null pointer error with solr deduplication

2012-04-23 Thread Mark Miller
he behavior may be non-deterministic. > > So Solr behaves as it should :) _unexpectedly_ > > But I agree in the sense that there must be no error, especially such as > NPE. > > Best Regards > Alexander Aristov > > > On 21 April 2012 03:42, Peter Markey wrote: > >

Re: null pointer error with solr deduplication

2012-04-21 Thread Alexander Aristov
e in the sense that there must be no error, especially such as NPE. Best Regards Alexander Aristov On 21 April 2012 03:42, Peter Markey wrote: > Hello, > > I have been trying out deduplication in Solr by following: > http://wiki.apache.org/solr/Deduplication. I have defined a sign

null pointer error with solr deduplication

2012-04-20 Thread Peter Markey
Hello, I have been trying out deduplication in Solr by following: http://wiki.apache.org/solr/Deduplication. I have defined a signature field to hold the values of the signature created based on a few other fields in a document and the idea seems to work like a charm in a single Solr instance. But

Re: Question about http://wiki.apache.org/solr/Deduplication

2011-04-04 Thread eks dev
Thanks Hoss, Externalizing this part is exactly the path we are exploring now, not only for this reason. We already started testing Hadoop SequenceFile as a write-ahead log for updates/deletes. SequenceFile supports append now (simply great!). It was a pain to have to add Hadoop into the mix for "

Re: Question about http://wiki.apache.org/solr/Deduplication

2011-04-02 Thread Chris Hostetter
: Is it possible in solr to have multivalued "id"? Or I need to make my : own "mv_ID" for this? Any ideas how to achieve this efficiently? This isn't something the SignatureUpdateProcessor is going to be able to help you with -- it does the deduplication by changing the low level "update" (impl

Question about http://wiki.apache.org/solr/Deduplication

2011-03-24 Thread eks dev
Hi, The use case I am trying to figure out is preserving IDs without re-indexing on a duplicate, and instead adding the new ID to a list of document ID "aliases". Example: Input collection: "id":1, "text":"dummy text 1", "signature":"A" "id":2, "text":"dummy text 1", "signature":"A" I add the first
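
One possible reading of this use case, sketched in memory with the two input documents from the message. This is an assumption about the intended behaviour, not an existing Solr feature: `add_with_aliases` and the `aliases` field are hypothetical names, and the index is modelled as a plain dict keyed by signature.

```python
def add_with_aliases(index, doc):
    """index maps signature -> stored document.

    The first document seen for a signature is stored untouched; later
    documents with the same signature only contribute their id to the
    stored document's 'aliases' list, so nothing is re-indexed.
    """
    existing = index.get(doc["signature"])
    if existing is None:
        index[doc["signature"]] = dict(doc, aliases=[])
        return index[doc["signature"]]
    if doc["id"] != existing["id"] and doc["id"] not in existing["aliases"]:
        existing["aliases"].append(doc["id"])
    return existing


# Input collection taken from the message above.
index = {}
add_with_aliases(index, {"id": 1, "text": "dummy text 1", "signature": "A"})
add_with_aliases(index, {"id": 2, "text": "dummy text 1", "signature": "A"})
```

After both calls the stored document keeps "id" 1 and carries 2 in its alias list, so a lookup by either ID could resolve to the same document.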

Re: SOLR deduplication

2011-01-26 Thread Markus Jelsma
Not right now: https://issues.apache.org/jira/browse/SOLR-1909 > Hi - I have the SOLR deduplication configured and working well. > > Is there any way I can tell which documents have not been added to the > index as a result of the deduplication rejecting subsequent identical

SOLR deduplication

2011-01-26 Thread Jason Brown
Hi - I have the SOLR deduplication configured and working well. Is there any way I can tell which documents have not been added to the index as a result of the deduplication rejecting subsequent identical documents? Many Thanks Jason Brown.

RE: Re: Solr Deduplication and Field Collpasing

2010-09-28 Thread Markus Jelsma
Correction, Java heap size should be RAM buffer size if I'm not too mistaken. -Original message- From: Markus Jelsma Sent: Wed 29-09-2010 01:17 To: solr-user@lucene.apache.org; Subject: RE: Re: Solr Deduplication and Field Collpasing If you can set the digest field for your

RE: Re: Solr Deduplication and Field Collpasing

2010-09-28 Thread Markus Jelsma
00:57 To: solr-user@lucene.apache.org; Subject: Re: Solr Deduplication and Field Collpasing I have the digest field already in the schema because the index is shared between Nutch docs and others. I do not know if the second approach is the quickest in my case. I can set the digest value to something

Re: Solr Deduplication and Field Collpasing

2010-09-28 Thread Nemani, Raj
date the digest field with the value from the corresponding id field using Solr? Thanks Raj - Original Message - From: Markus Jelsma To: solr-user@lucene.apache.org Sent: Tue Sep 28 18:19:17 2010 Subject: RE: Solr Deduplication and Field Collpasing You could create a custom update p

RE: Solr Deduplication and Field Collpasing

2010-09-28 Thread Markus Jelsma
mani, Raj Sent: Tue 28-09-2010 23:28 To: solr-user@lucene.apache.org; Subject: Solr Deduplication and Field Collpasing All, I have set up Nutch to submit the crawl results to the Solr index. I have some duplicates in the documents generated by the Nutch crawl. There is a field 'digest' tha

Solr Deduplication and Field Collpasing

2010-09-28 Thread Nemani, Raj
All, I have set up Nutch to submit the crawl results to the Solr index. I have some duplicates in the documents generated by the Nutch crawl. There is a field 'digest' that Nutch generates that is the same for those documents that are duplicates. While setting up the dedupe processor in the Solr co