I thought I had sent this reply over the weekend.  I had it all ready to
go, but it's still here waiting in my Drafts folder, so I'll send it now.

On 3/25/2016 11:05 AM, Victor D'agostino wrote:
> I am trying to set up a Solr Cloud environment of two Solr 5.4.1 nodes
> but the data are always indexed on the first node although the unique
> id is a GUID.
>
> It looks like I can't add an additional node. Could you tell me where
> i'm wrong ?
>
> I try to set up a collection named "db" with two shards on each node.
> Without replica. The config is named "copiemail3".

<snip>

> On node n°2
> I start Solr and create the two shards with the cores API (collections
> API won't work because i use compositeId routing mode) :
>  wget --no-proxy
> "http://$HOSTNAME:8983/solr/admin/cores?action=CREATE&schema=schema.xml&shard=shard3&instanceDir=db_shard3_replica1&indexInfo=false&name=db_shard3_replica1&config=solrconfig.xml&collection=db&dataDir=data";
>  wget --no-proxy
> "http://$HOSTNAME:8983/solr/admin/cores?action=CREATE&schema=schema.xml&shard=shard4&instanceDir=db_shard4_replica1&indexInfo=false&name=db_shard4_replica1&config=solrconfig.xml&collection=db&dataDir=data";
> Like node 1 i activate the ping and restart Solr.

This is why it's a VERY bad idea to use CoreAdmin in cloud mode unless
you understand *EXACTLY* what you are doing and how SolrCloud functions
internally.  There's no polite way to tell you that you don't have this
expert-level understanding.

The CoreAdmin calls that you executed have added two new shards to your
collection.  This might be what you intended, but as you have
discovered, the true effects are not what you *wanted*.

Your interaction with SolrCloud collections should always be through the
Collections API.  Any other method may not work as expected.

When you first create your compositeId-routed collection, you need to
tell Solr exactly what you want (number of shards, number of replicas). 
If you had used replicationFactor=2, then your second node would have
had replicas of both shards from the beginning.  You can add replicas
later with the ADDREPLICA action on the Collections API.
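
As a rough sketch of what that could look like for your setup: the calls
below only build and print the URLs.  SOLR_HOST is a placeholder I made up,
and the parameter values are my reading of what you described (four shards,
two per node, no extra replicas), so adjust them before sending anything.

```shell
# SOLR_HOST is a placeholder; point it at any node in your cloud.
SOLR_HOST="localhost"
BASE="http://$SOLR_HOST:8983/solr/admin/collections"

# Create "db" with four shards, two per node, one copy of each shard.
CREATE_URL="$BASE?action=CREATE&name=db&numShards=4&replicationFactor=1&maxShardsPerNode=2&collection.configName=copiemail3"

# Later on, add a second copy of shard1 (Solr picks the node unless you pass node=).
ADDREPLICA_URL="$BASE?action=ADDREPLICA&collection=db&shard=shard1"

echo "$CREATE_URL"
echo "$ADDREPLICA_URL"

# To actually send a request:
#   wget --no-proxy -O - "$CREATE_URL"
```

Both nodes need to be running and registered in ZooKeeper before the CREATE
call, so that Solr can place shards on each of them.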

The implicit router means 100% manual routing, and you probably do NOT
want that.  A collection using implicit routing is one that lets you add
shards with no problems.  This is because indexing to such a collection
requires that you choose which shard will receive every indexing request
-- nothing will be automatically routed.
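
For comparison only, here is roughly what the manual approach involves.  The
collection name "db_manual" and the shard names are hypothetical, SOLR_HOST is
a placeholder, and the block only prints the URLs rather than sending them.

```shell
# SOLR_HOST is a placeholder; point it at any node in your cloud.
SOLR_HOST="localhost"
SOLR="http://$SOLR_HOST:8983/solr"

# Create a manually routed collection with two named shards.
CREATE_URL="$SOLR/admin/collections?action=CREATE&name=db_manual&router.name=implicit&shards=shardA,shardB&collection.configName=copiemail3"

# Adding a shard later is allowed with the implicit router.
CREATESHARD_URL="$SOLR/admin/collections?action=CREATESHARD&collection=db_manual&shard=shardC"

# Every update request then has to name its target shard itself.
UPDATE_URL="$SOLR/db_manual/update?_route_=shardC&commit=true"

echo "$CREATE_URL"
echo "$CREATESHARD_URL"
echo "$UPDATE_URL"
```

The _route_ parameter on the update request is the part that makes this
100% manual: your indexing code has to pick the shard for every document.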

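If you want Solr to automatically handle shard routing (compositeId) you
can't just add shards to your collection and expect them to be used.
Every document is routed by the hash of its id, and the existing shards
already cover the whole hash range, so a shard added afterward never
receives anything.  This is why the Collections API refuses to add shards
when you're using compositeId.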
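
To picture why, here is a small sketch of how a 32-bit hash space gets cut
into equal slices when a collection is created.  The function name
shard_range is my own invention, and this assumes a plain even split; Solr's
own rounding may differ for shard counts that don't divide evenly.

```shell
# shard_range NUM_SHARDS INDEX
# Prints the hex hash range of the shard at zero-based INDEX when the
# 32-bit hash space is cut into NUM_SHARDS equal slices.
shard_range() {
  span=$(( 4294967296 / $1 ))                              # 2^32 / shard count
  start=$(( (2147483648 + $2 * $span) & 4294967295 ))      # begin at 0x80000000, wrap at 2^32
  end=$(( ($start + $span - 1) & 4294967295 ))
  printf '%x-%x\n' "$start" "$end"
}

# Print the four slices of a four-shard collection.
i=0
while [ "$i" -lt 4 ]; do
  printf 'shard%d: %s\n' "$((i + 1))" "$(shard_range 4 "$i")"
  i=$((i + 1))
done
```

For four shards this prints 80000000-bfffffff, c0000000-ffffffff, 0-3fffffff
and 40000000-7fffffff.  Together those cover every possible hash value, so
there is nothing left over for a fifth shard.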

The shard routing and the number of total shards for compositeId are
established when the collection is created, and can only be changed by
splitting shards (a Collections API action) or manually changing the
hash ranges in the clusterstate in ZooKeeper.  Manual clusterstate
editing is only recommended as a *last* resort for fixing a completely
broken collection.  In normal situations even *experts* shouldn't edit
the clusterstate.  It's extremely easy to break SolrCloud with these edits.
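
If you do need more shards later, the supported route looks roughly like
this.  As before, SOLR_HOST is a placeholder and the block only prints the
URLs; shard1 is just an example of a shard you might choose to split.

```shell
# SOLR_HOST is a placeholder; point it at any node in your cloud.
SOLR_HOST="localhost"
BASE="http://$SOLR_HOST:8983/solr/admin/collections"

# Look at the current shards and their hash ranges without touching ZooKeeper.
STATUS_URL="$BASE?action=CLUSTERSTATUS&collection=db&wt=json"

# Split shard1 into two sub-shards, each covering half of its hash range.
SPLIT_URL="$BASE?action=SPLITSHARD&collection=db&shard=shard1"

echo "$STATUS_URL"
echo "$SPLIT_URL"
```

The sub-shards are created on the same node as the parent shard, so getting
the data onto your second node would still mean adding replicas there
afterward.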

Thanks,
Shawn
