Copying a SolrCloud collection to other hosts

2018-03-06  Patrick Schemitz
Hi List,

so I'm running a bunch of SolrCloud clusters (each cluster is: 8 shards
on 2 servers, with 4 instances per server, no replicas, i.e. 1 shard per
instance).

Building the index afresh takes 15+ hours, so when I have to deploy a new
index, I build it once, on one cluster, and then copy (scp) over the
data//index directories (shutting down the Solr instances first).
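
(For reference, the copy boils down to something like this; the host
names, paths, and service names below are placeholders for our setup,
so read it as a sketch rather than the actual script:)

    #!/bin/bash
    # On a target server: stop the local Solr instances, pull the data
    # directories from the matching source server, then start them again.
    SRC=source-node1
    for i in 00 01 02 03; do
      systemctl stop "solr-instance${i}"
    done
    for i in 00 01 02 03; do
      scp -r "${SRC}:/var/lib/solr/instance${i}/data" "/var/lib/solr/instance${i}/"
    done
    for i in 00 01 02 03; do
      systemctl start "solr-instance${i}"
    done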

I could get Solr 6.5.1 to number the shard/replica directories nicely via
the createNodeSet and createNodeSet.shuffle options:

Solr 6.5.1 /var/lib/solr:

Server node 1:
instance00/data/main_index_shard1_replica1
instance01/data/main_index_shard2_replica1
instance02/data/main_index_shard3_replica1
instance03/data/main_index_shard4_replica1

Server node 2:
instance00/data/main_index_shard5_replica1
instance01/data/main_index_shard6_replica1
instance02/data/main_index_shard7_replica1
instance03/data/main_index_shard8_replica1
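
(For completeness, the collections were created with something along
these lines; host names and ports are placeholders:)

    # createNodeSet pins the replicas to the listed nodes, and
    # createNodeSet.shuffle=false keeps them in that order.
    curl 'http://node1:8983/solr/admin/collections?action=CREATE&name=main_index&numShards=8&replicationFactor=1&createNodeSet.shuffle=false&createNodeSet=node1:8983_solr,node1:8984_solr,node1:8985_solr,node1:8986_solr,node2:8983_solr,node2:8984_solr,node2:8985_solr,node2:8986_solr'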

However, while attempting to upgrade to 7.2.1, I found that this numbering has changed:

Solr 7.2.1 /var/lib/solr:

Server node 1:
instance00/data/main_index_shard1_replica_n1
instance01/data/main_index_shard2_replica_n2
instance02/data/main_index_shard3_replica_n4
instance03/data/main_index_shard4_replica_n6

Server node 2:
instance00/data/main_index_shard5_replica_n8
instance01/data/main_index_shard6_replica_n10
instance02/data/main_index_shard7_replica_n12
instance03/data/main_index_shard8_replica_n14

This new numbering breaks my copy script, and furthermore, I'm worried
about what happens when the numbering differs among target clusters.

How can I switch this back to the old numbering scheme?

Side note: is there a recommended way of doing this? Is the
backup/restore mechanism suitable for this? The ref guide is kind of terse
here.
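
(From the ref guide I gather the calls would look something like the
following, with a backup location that the relevant nodes can reach;
names and paths are placeholders:)

    # Back up on the source cluster...
    curl 'http://node1:8983/solr/admin/collections?action=BACKUP&name=main_index_bak&collection=main_index&location=/mnt/shared/backups'
    # ...then restore on the target cluster. RESTORE creates the
    # collection, so it must not already exist there.
    curl 'http://target-node1:8983/solr/admin/collections?action=RESTORE&name=main_index_bak&collection=main_index&location=/mnt/shared/backups'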

Thanks in advance,

Ciao, Patrick


Re: Copying a SolrCloud collection to other hosts

2018-03-15  Patrick Schemitz
Hi Erick,

thanks a lot, that solved our problem nicely.

(It took us a try or two to notice that this copies not the entire
collection but only the shard on the source instance, so we need to do
it for every instance explicitly. But hey, we had to do the same for
the old approach of scp'ing the data directories.)
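
For the archives, our per-shard loop boils down to roughly this (host
names and ports are placeholders; it assumes one core per instance and
shard N living on the same port on both clusters):

    #!/bin/bash
    # For each instance port, look up the (single) core name on both
    # sides, then have the target core fetch the source core's index.
    SRC=source-node1
    TGT=target-node1
    for port in 8983 8984 8985 8986; do
      src_core=$(curl -s "http://${SRC}:${port}/solr/admin/cores?action=STATUS&wt=json" \
                 | jq -r '.status | keys[0]')
      tgt_core=$(curl -s "http://${TGT}:${port}/solr/admin/cores?action=STATUS&wt=json" \
                 | jq -r '.status | keys[0]')
      curl -s "http://${TGT}:${port}/solr/${tgt_core}/replication?command=fetchindex&masterUrl=http://${SRC}:${port}/solr/${src_core}"
    done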

Ciao, Patrick

On Tue, Mar 06, 2018 at 07:18:15AM -0800, Erick Erickson wrote:
> This is part of the "different replica types" capability: besides NRT
> (the only type available prior to 7.x), there are now TLOG and PULL
> replicas, which get different names. I don't know of any way to switch
> it off.
> 
> As far as moving the data, here's a little known trick: Use the
> replication API to issue a fetchindex; see:
> https://lucene.apache.org/solr/guide/6_6/index-replication.html As
> long as the target cluster can "see" the source cluster via http, this
> should work. This is entirely outside SolrCloud and ZooKeeper is not
> involved. This would even work with, say, one side being stand-alone
> and the other being SolrCloud (not that you want to do that, just
> illustrating it's not part of SolrCloud)...
> 
> So you'd specify something like:
> http://target_node:port/solr/core_name/replication?command=fetchindex&masterUrl=http://source_node:port/solr/core_name
> 
> "core_name" in these cases is what appears in the "cores" dropdown on
> the admin UI page. You do not have to shut Solr down at all on either
> end to use this, although last I knew the target node would not serve
> queries while this was happening.
> 
> An alternative is to not hard-code the names in your copy script, but
> rather to look up the source and target information in ZooKeeper; you
> could do this with the CLUSTERSTATUS Collections API call.
> 
> Best,
> Erick
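
P.S., for anyone else doing this: the CLUSTERSTATUS lookup Erick
suggests comes down to something like the following (collection name
and host are placeholders); it prints shard, core name, and node for
every replica:

    # Map each shard to its core name and node, so a copy script need
    # not hard-code the replica_nNN suffixes.
    curl -s 'http://node1:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=main_index&wt=json' \
      | jq -r '.cluster.collections.main_index.shards | to_entries[]
               | .key as $shard | .value.replicas[]
               | "\($shard) \(.core) \(.node_name)"'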