To answer my own post about the subtle difference between the shard and
replica examples, it looks like the difference is in the numShards
parameter.

If you set numShards to 2, then creating more than 2 shards will give you
replicas.  Is that correct?
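
A rough way to picture that assignment is the toy model below. It is only a
sketch under an assumption taken from the SolrCloud wiki examples: the first
numShards instances each start a new shard, and every later instance joins the
shard that has the fewest nodes so far. The function name assign_nodes and the
node/shard labels are made up for illustration and are not part of Solr.

```shell
#!/bin/sh
# Toy model of node placement when a collection is bootstrapped with -DnumShards.
# Assumption (not Solr code): the first num_shards nodes each start a new shard,
# and each later node becomes a replica of the shard with the fewest nodes.
assign_nodes() {
    num_shards=$1   # value passed as -DnumShards
    num_nodes=$2    # total number of Solr instances started
    node=1
    while [ "$node" -le "$num_nodes" ]; do
        # Filling shards in order is the same as "fewest nodes first" here.
        shard=$(( (node - 1) % num_shards + 1 ))
        if [ "$node" -le "$num_shards" ]; then
            role="new shard"
        else
            role="replica"
        fi
        echo "node$node -> shard$shard ($role)"
        node=$((node + 1))
    done
}

assign_nodes 2 4   # two shards, four instances: instances 3 and 4 are replicas
echo "---"
assign_nodes 4 4   # four shards, four instances: no replicas
```

Under that assumption, starting four instances with numShards=4 should give
four separate shards and no replicas.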

If that is the case, I think that my settings are correct.  I still cannot
explain why the data is growing on all the shards at the same time.

One thing I noticed is that three of them are leaders in the SolrCloud
admin UI graph.  Is that normal?


Thierry




On Mon, Aug 12, 2013 at 5:39 PM, Thierry Thelliez <
thierry.thelliez.t...@gmail.com> wrote:

>
> Thanks Shawn for the detailed instructions.
>
> About the router: it is implicit.
>
> About the replicas: I followed the example at
> http://wiki.apache.org/solr/SolrCloud
>
> I start the shards with the following (paths and ports simplified):
>
> cd /.../solr/shard1/
> /usr/bin/java -Djetty.port=1 -Dbootstrap_confdir=./solr/collection1/conf
> -Dcollection.configName=myconf -DzkRun=localhost:0 -DnumShards=4 -jar
> start.jar > /.../log/shard_1.log
>
> cd /.../solr/shard2/
> /usr/bin/java -Djetty.port=2 -DzkHost=localhost:0 -jar start.jar >
> /.../log/shard_2.log
>
> and same thing for the two other shards on their own ports.
>
>
> To post a document (CSV file), I use:
>
> curl http://localhost:shardport/solr/update --data-binary @file.csv
> -H 'Content-type:text/csv; charset=ISO-8859-1'
>
>
> I just re-read the example page at http://wiki.apache.org/solr/SolrCloud
> and I see that there is no difference between starting a shard and starting
> a replica.  I must be missing something:
>
> From exampleA (two shards):
>
> cd example2
>
> java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
>
> From exampleB (two shards with replicas):
>
> cd exampleB
>
> java -Djetty.port=8900 -DzkHost=localhost:9983 -jar start.jar
>
> Thanks.
> Thierry
>
> On Mon, Aug 12, 2013 at 5:04 PM, Shawn Heisey <s...@elyograg.org> wrote:
>
>> On 8/12/2013 4:50 PM, Thierry Thelliez wrote:
>>
>>> Hello,  I am trying to set up a four-shard system for the first time.  I
>>> do not understand why the data on all the shards is growing at about the
>>> same rate when I push the documents to only one shard.
>>>
>>> The four shards represent four calendar years.  And for now, on a
>>> development machine, these four shards run on four different ports.
>>>
>>> The first shard is started with Zookeeper.
>>>
>>> The logs of the other shards are filled with something like:
>>>
>>> 7882051 [qtp1154079020-1245] INFO
>>> org.apache.solr.update.processor.LogUpdateProcessor – [collection1]
>>> webapp=/solr path=/update
>>> params={distrib.from=http://x.y.z.4:50121/solr/collection1/&update.distrib=TOLEADER&wt=javabin&version=2}
>>> {add=[14939-96467-304 (1443204912169091072), 14939-96467-308
>>> (1443204912179576832), 14939-96467-310 (1443204912185868288),
>>> 14939-96467-311 (1443204912192159744), 14939-96467-313
>>> (1443204912204742656), 14939-96467-314 (1443204912220471296),
>>> 14939-96467-318 (1443204912239345664), 14939-96467-319
>>> (1443204912250880000), 14939-96467-322 (1443204912257171456),
>>> 14939-96467-324 (1443204912263462912)]} 0 282
>>>
>>> What is getting written to the other shards? Is a separate index computed
>>> on all four shards?  I thought that when pushing a document to one shard,
>>> only that shard would update its index.
>>>
>>
>> There are two possibilities.
>>
>> 1) You don't have four shards, you have four replicas of one shard.  If
>> this is happening, then they all will receive all documents.
>>
>> 2) You are using a router like compositeId instead of implicit.  This
>> will calculate the hash of the id field and evenly divide the documents
>> among all the shards in the collection according to the hash value.  If you
>> create the collection with the implicit router, then documents should be
>> indexed by the shard that received them.
>>
>> To see what router you have, click on Cloud in the admin UI, then click
>> on Tree.  Click the arrow to the left of '/collections' to open it. Click
>> on collection1 (or whichever you are actually using) -- the actual name,
>> not the arrow.  Underneath the table that appears to the right will be
>> "router" and its value.
>>
>> Thanks,
>> Shawn
>>
>>
>
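
The difference between the two routers in Shawn's second possibility can be
pictured with the toy script below. It is not Solr's real algorithm: cksum and
a modulo stand in for the hash function and per-shard hash ranges that
compositeId actually uses, and the function names route_by_hash and
route_implicit are made up for illustration. The document ids are taken from
the log excerpt above.

```shell
#!/bin/sh
# Toy comparison of hash-based and implicit routing. NOT Solr's implementation:
# cksum and modulo are stand-ins for Solr's hash function and hash ranges.
NUM_SHARDS=4

# Hash-style routing (compositeId-like): the target shard depends only on
# the document id, not on which shard received the request.
route_by_hash() {
    hash=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo "shard$(( hash % NUM_SHARDS + 1 ))"
}

# Implicit-style routing: the document is indexed by the receiving shard.
route_implicit() {
    echo "$2"
}

# Every document below is posted to shard1.
for id in 14939-96467-304 14939-96467-308 14939-96467-310 14939-96467-311; do
    echo "$id: hash -> $(route_by_hash "$id"), implicit -> $(route_implicit "$id" shard1)"
done
```

With hash-style routing the ids are spread over the shards no matter where
they are posted, so every shard grows; with implicit-style routing only the
receiving shard does.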
