Yes, it looks like you're on track, good luck!
On Mon, Oct 23, 2017 at 5:21 PM, Marko Babic <babma...@abebooks.com> wrote:
> Thanks for the quick reply, Erick.
>
> To follow up:
>
> “
> Well, first you can explicitly set legacyCloud=true by using the
> Collections API CLUSTERPROP command. I don't recommend this, mind you,
> as legacyCloud will not be supported forever.
> “
>
> Yes, but like you say: we’ll have to deal with it at some point, so there’s
> not much benefit in punting.
>
> “
> I'm not following something here though. When you say:
> "The desired final state of such a deployment is a fully configured
> cluster ready to accept updates."
> are there any documents already in the index or is this really a new
> collection?
> “
>
> It’s a brand new collection with new configuration on fresh hardware which
> we’ll then fully index from a source document store (we do this when we
> have certain schema changes that require re-indexing or we want to
> experiment).
>
> “
> Not sure what you mean here. Configuration of what? Just spinning up
> a Solr node pointing to the right ZooKeeper should be sufficient, or
> I'm not understanding at all.
> “
>
> Apologies, the way I stated that was all wrong: by “requires configuration”
> I just meant to note the need to specify a shard and a node when adding a
> replica (and not even the node, as you point out to me below ☺).
>
> “
> I suspect you're really talking about the "node" parameter
> to ADDREPLICA
> “
>
> Ah, yes: that is what I meant, sorry.
>
> It sounds like I haven’t missed too much in the documentation then. I’ll
> look more into replica placement rules.
>
> Thank you so much again for your time and help.
>
> Marko
>
>
> On 10/23/17, 4:33 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
> Well, first you can explicitly set legacyCloud=true by using the
> Collections API CLUSTERPROP command. I don't recommend this, mind you,
> as legacyCloud will not be supported forever.
>
> I'm not following something here though.
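For reference, the CLUSTERPROP call discussed above would look something like the following. The host and port are placeholders for any live Solr node in the cluster:

```shell
# Sketch only: explicitly set legacyCloud cluster-wide via the
# Collections API CLUSTERPROP command (not recommended long-term,
# as noted above, since legacyCloud will eventually be removed).
# "localhost:8983" is a placeholder for any node in the cluster.
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=legacyCloud&val=true"
```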
> When you say:
> "The desired final state of such a deployment is a fully configured
> cluster ready to accept updates."
> are there any documents already in the index or is this really a new
> collection?
>
> and "adding new nodes requires explicit configuration"
>
> Not sure what you mean here. Configuration of what? Just spinning up
> a Solr node pointing to the right ZooKeeper should be sufficient, or
> I'm not understanding at all.
>
> If not, your proposed outline seems right with one difference:
> "if a node needs to be added: provision a machine, start up Solr, use
> ADDREPLICA from Collections API passing shard number and coreNodeName"
>
> coreNodeName isn't something you ordinarily need to bother with. I'm
> being specific here because coreNodeName is usually something like
> core_node7. I suspect you're really talking about the "node" parameter
> to ADDREPLICA, something like: 192.168.1.32:8983_solr, the entry from
> live_nodes.
>
> Now, all that said, you may be better off just letting Solr add the
> replica where it wants; it'll usually put a new replica on a node
> without replicas, so specifying the collection and shard should be
> sufficient. Also, note that there are replica placement rules that can
> help enforce this kind of thing.
>
> Best,
> Erick
>
> On Mon, Oct 23, 2017 at 3:12 PM, Marko Babic <babma...@abebooks.com> wrote:
> > Hi everyone,
> >
> > I'm working on upgrading a set of clusters from Solr 4.10.4 to
> > Solr 7.1.0.
> >
> > Our deployment tooling no longer works given that legacyCloud defaults
> > to false (SOLR-8256) and I'm hoping to get some advice on what to do
> > going forward.
> >
> > Our setup is as follows:
> >   * we run in AWS with multiple independent Solr clusters, each with
> >     its own ZooKeeper tier
> >   * each cluster hosts only a single collection
> >   * each machine/node in the cluster has a single core / is a replica
> >     for one shard in the collection
> >
> > We bring up new clusters as needed.
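The ADDREPLICA call with the "node" parameter that Erick describes might look like this; the collection name, shard, and node entry are illustrative placeholders:

```shell
# Sketch: add a replica of shard1 on a specific node. "mycollection"
# and the node entry (the value as it appears under live_nodes in
# ZooKeeper) are placeholders.
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard1&node=192.168.1.32:8983_solr"

# Or, per Erick's suggestion, omit the node parameter entirely and let
# Solr place the replica itself (it will usually pick a node that has
# no replicas yet):
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard1"
```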
> > This is entirely automated and basically works as follows:
> >   * we first provision and set up a fresh ZooKeeper tier
> >   * then, we provision a Solr bootstrapper machine that uploads the
> >     collection config, specifies numShards, and starts up
> >   * it's then easy to provision the rest of the machines and have them
> >     automatically join a shard in the collection by hooking them up to
> >     the right ZooKeeper cluster and specifying numShards
> >   * if a node needs to be added to the cluster we just need to spin up
> >     a machine and start Solr
> >
> > The desired final state of such a deployment is a fully configured
> > cluster ready to accept updates.
> >
> > Now that legacyCloud is false, I'm not sure how to preserve this
> > pretty nice, hands-off deployment style, as the bootstrapping
> > performed by the first node provisioned doesn't create a collection
> > and adding new nodes requires explicit configuration.
> >
> > A new deployment procedure that I've worked out using the Collections
> > API would look like:
> >   * provision the ZooKeeper tier
> >   * provision all the Solr nodes and wait for them all to come up
> >   * upload the collection config + solr.xml to ZooKeeper
> >   * create the collection using the Collections API
> >   * if a node needs to be added: provision a machine, start up Solr,
> >     and use ADDREPLICA from the Collections API, passing the shard
> >     number and coreNodeName
> >
> > This isn't a giant deal to build, but it adds complexity that I'm not
> > excited about, as the deployment tooling needs to have some
> > understanding of the global state of the cluster before it can create
> > a collection or add/replace nodes.
> >
> > The questions I was hoping someone would have some time to help me
> > with are:
> >
> >   * Does the new deployment procedure I've suggested seem reasonable?
> >     Would we be doing anything wrong/fighting best practices?
> >   * Is there a way to keep cluster provisioning automated without
> >     having to build additional orchestration logic into our deployment
> >     tooling (using autoscaling, or triggers, or something I don't know
> >     about)?
> >
> > Apologies for the wall of text and thanks. :)
> >
> > Marko
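The "create the collection using the Collections API" step of the proposed procedure could be sketched as follows, assuming the config set has already been uploaded to ZooKeeper; the collection name, config name, shard count, and host are placeholders:

```shell
# Sketch of the collection-creation step. Assumes the config set
# "myconf" was already uploaded to ZooKeeper. "mycollection",
# numShards=3, replicationFactor=1, and the host are placeholders
# to adapt to the actual cluster layout.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=1&collection.configName=myconf"
```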