subject:"Duplicated Documents Across shards"

Re: Duplicated Documents Across shards

2013-05-06 Thread Shawn Heisey

> Oops... you're right, and before I started writing that response I had the > thought that these should be "shardDir", but even that is confused. I > think > "replicaDir" or "collectionReplica" or "shardReplicaDir" or... > "collectionShardReplicaDir" - the latter is wordy, but is explicit. I'd > r

Re: Duplicated Documents Across shards

2013-05-06 Thread Jack Krupansky

rm. Even a "single core" Solr is using a "collection" (that happens to be single-core and single-shard and single-replica.) To wit, the stock Solr example, which is not SolrCloud, is named "collection1". -- Jack Krupansky -Original Message- From: Shawn

Re: Duplicated Documents Across shards

2013-05-06 Thread Shawn Heisey

On 5/6/2013 7:44 AM, Jack Krupansky wrote: > I think if we had a more comprehensible term for a "collection > configuration directory", a lot of the confusion would go away. I mean, > what the heck is an "instance" anyway? How does "instanceDir" relate to > an "instance" of the Solr "server"? Sure,

Re: Duplicated Documents Across shards

2013-05-06 Thread Jack Krupansky

think it's the same for all cores in a Solr "instance". We should reconsider the name of that term. My choice: collectionDir. -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Monday, May 06, 2013 7:39 AM To: solr-user@lucene.apache.org Subject: Re: D

Re: Duplicated Documents Across shards

2013-05-06 Thread Iker Mtnz. Apellaniz

Thank you very much, Erick. That was the real problem: we had two cores sharing the same folder and core_name. Here is the definitive version of the solr.xml. Tested and working correctly. Thanks everybody, Iker 2013/5/6 Erick Erickson > Having multiple cores point to the same index is, exc

Re: Duplicated Documents Across shards

2013-05-06 Thread Erick Erickson

Having multiple cores point to the same index is, except for special circumstances where one of the cores is guaranteed to be read only, a Bad Thing. So it sounds like you've found your issue... Best Erick On Mon, May 6, 2013 at 4:44 AM, Iker Mtnz. Apellaniz wrote: > Thanks Erick, > I think w
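A minimal sketch of how one might check for the condition Erick describes, assuming the legacy Solr 4.x solr.xml layout with <core> elements carrying name and instanceDir attributes. The solr.xml from the thread is not preserved in this archive, so the sample below and its core and directory names are hypothetical, and the helper cores_sharing_instance_dir is illustrative only. It compares instanceDir values and does not account for an explicit dataDir override.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def cores_sharing_instance_dir(solr_xml_text):
    """Return {instanceDir: [core names]} for directories used by more than one core."""
    root = ET.fromstring(solr_xml_text)
    cores_by_dir = defaultdict(list)
    for core in root.iter("core"):
        # Normalise a trailing slash so "shared/" and "shared" compare equal.
        instance_dir = core.get("instanceDir", "").rstrip("/")
        cores_by_dir[instance_dir].append(core.get("name"))
    return {d: names for d, names in cores_by_dir.items() if len(names) > 1}


# Hypothetical configuration: two cores pointing at the same folder.
SAMPLE_SOLR_XML = """<solr>
  <cores adminPath="/admin/cores">
    <core name="shard2" instanceDir="shared/" />
    <core name="shard4" instanceDir="shared/" />
  </cores>
</solr>"""

if __name__ == "__main__":
    print(cores_sharing_instance_dir(SAMPLE_SOLR_XML))
```

An empty result means every core has its own instanceDir, which is the layout the thread settles on.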

Re: Duplicated Documents Across shards

2013-05-06 Thread Iker Mtnz. Apellaniz

Thanks Erick, I think we found the problem. When defining the cores for both shards we define both of them in the same instanceDir, like this: Each shard should have its own folder, so the final configuration should be like this: Can anyone confirm this? Thanks, Iker 2013/5/4 Erick E

Re: Duplicated Documents Across shards

2013-05-04 Thread Erick Erickson

Sounds like you've explicitly routed the same document to two different shards. Document replacement only happens locally to a shard, so the fact that you have documents with the same ID on two different shards is why you're getting duplicate documents. Best Erick On Fri, May 3, 2013 at 3:44 PM,
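Since replacement by ID only happens within a shard, one way to confirm this diagnosis is to collect the document IDs held by each shard and look for IDs that appear on more than one. The sketch below assumes the IDs have already been fetched per shard; the function name and the shard labels are hypothetical.

```python
from collections import defaultdict


def ids_on_multiple_shards(ids_by_shard):
    """Return {doc_id: sorted shard names} for IDs found on more than one shard."""
    shards_for_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_for_id[doc_id].add(shard)
    return {doc_id: sorted(shards)
            for doc_id, shards in shards_for_id.items()
            if len(shards) > 1}


if __name__ == "__main__":
    # Hypothetical ID lists, one per shard.
    print(ids_on_multiple_shards({"shard1": ["a", "b"], "shard2": ["b", "c"]}))
```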

Re: Duplicated Documents Across shards

2013-05-03 Thread Iker Mtnz. Apellaniz

We are currently using version 4.2. We have run tests with a single document and it gives us a document count of 2. But if we force it onto the first machine, the one with a single shard, the count gives us 1 document. I've tried using the distrib=false parameter, it gives us no duplicate documents,

Re: Duplicated Documents Across shards

2013-05-03 Thread Erick Erickson

What version of Solr? The custom routing stuff is quite new so I'm guessing 4x? But this shouldn't be happening. The actual index data for the shards should be in separate directories, they just happen to be on the same physical machine. Try querying each one with &distrib=false to see the counts
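The per-shard check suggested here could be scripted roughly as follows. This is a sketch only: the host and core names in the usage are hypothetical, and it assumes the standard /select handler with wt=json, whose response carries the count under response.numFound.

```python
import json
from urllib.parse import urlencode


def shard_count_url(core_url, query="*:*"):
    """Build a select URL that asks a single core for its local count only."""
    params = {"q": query, "rows": 0, "wt": "json", "distrib": "false"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def num_found(response_text):
    """Extract numFound from a Solr JSON select response body."""
    return json.loads(response_text)["response"]["numFound"]


if __name__ == "__main__":
    # Hypothetical core URL; fetch the printed URL (for example with
    # urllib.request.urlopen) and pass the response body to num_found().
    print(shard_count_url("http://host1:8983/solr/shard1"))
```

If the local counts add up to more than the number of unique documents, at least one document lives on more than one shard.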

Duplicated Documents Across shards

2013-05-03 Thread Iker Mtnz. Apellaniz

Hi, We currently have a SolrCloud implementation running 5 shards on 3 physical machines, so the first machine will have shard number 1, the second machine shards 2 & 4, and the third shards 3 & 5. We noticed that while querying, numFoundDocs decreased when we increased the start param. After

11 matches
