I don’t think an autoscaling group is the right way to bring back a Zookeeper node. ZK nodes have identity; that identity is key to the operation of ZK. You cannot just swap in a new random node, and the ensemble doesn’t “scale up”.
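For illustration, here is a minimal sketch of how that identity is wired up; the hostnames and paths are assumptions, not anything from this thread. Every node carries the same server list in zoo.cfg, and each node’s own ID lives in its myid file:

    # zoo.cfg -- identical on every node (paths and hostnames assumed)
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    # fixed server IDs mapped to stable hostnames
    server.1=zk1.example.com:2888:3888
    server.2=zk2.example.com:2888:3888
    server.3=zk3.example.com:2888:3888
    server.4=zk4.example.com:2888:3888
    server.5=zk5.example.com:2888:3888

    # on the machine replacing a failed zk3, reuse ID 3:
    echo 3 > /var/lib/zookeeper/myid

If a replacement comes up with a different ID or hostname, the rest of the ensemble treats it as a stranger, not as the node that failed.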
Size the ZK cluster for the number of anticipated failures. I like a five-node cluster, which can handle two failures (three of five is still a quorum). That allows for a random failure while a node is down for maintenance. If you lose a node, configure a new one to replace it, with the right ID in the Zookeeper config files, then bring it back up with the same hostname.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Feb 24, 2019, at 11:22 PM, Addison, Alex (LNG-LON) <alex.addi...@lexisnexis.co.uk.INVALID> wrote:
>
> Hi all, we're looking at how to run Solr & Zookeeper in production. We're running everything in AWS, and for resiliency we're using Exhibitor with Zookeeper and keeping Zookeeper in an auto-scaling group just to re-create instances that are terminated for whatever reason.
> Unfortunately it's not simple to set this up so that Zookeeper retains a fixed IP or DNS name through such re-creation (i.e. the new virtual machine will have a new name and IP address); is there a way to inform Solr that the set of Zookeeper nodes it should talk to has changed? We're using Solr Cloud 7.7.
>
> Thanks,
> Alex Addison
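For completeness, the Solr side of the fixed-hostname approach is a static connection string. As far as I know, Solr 7.x reads it once at startup (ZK_HOST in solr.in.sh, or -z on the bin/solr command line), so changing the set of ZK nodes means editing that string and restarting every Solr node; stable DNS names for the ZK nodes avoid the problem entirely. A sketch, with assumed hostnames and an optional /solr chroot:

    # solr.in.sh -- ZooKeeper ensemble Solr connects to at startup
    # (hostnames assumed; the trailing /solr chroot is optional)
    ZK_HOST="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181,zk4.example.com:2181,zk5.example.com:2181/solr"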