Thanks for the confirmation, Shawn. Distributed systems are hard, so this makes sense.

I have a large, stable cluster (stable in terms of leadership and performance) with a single shard. The cluster scales up and down with additional PULL replicas over the day with the traffic curve. It's going to take a bit of coordination to get all nodes to mount a shared volume when we take a backup and then unmount when done.

Any idea what happens if a node joins or leaves during a backup?

On Thu, 31 May 2018 at 06:14, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/29/2018 3:01 PM, Greg Roodt wrote:
> > What is the best way to perform a backup of a Solr Cloud cluster? Is there
> > a way to backup only the leader? From my tests with the collections admin
> > BACKUP command, all nodes in the cluster need to have access to a shared
> > filesystem. Surely that isn't necessary if you are backing up the leader or
> > TLOG replica?
>
> If you have more than one Solr instance in your cloud, then all of those
> instances must have access to the same filesystem accessed from the same
> mount point. Together, they will write the entire collection to various
> subdirectories in that location.
>
> I can't find any mention of whether backups are load balanced across the
> cloud, or if they always use leaders. I would assume the former. If
> that's how it works, then you don't know which machine is going to do
> the backup of a given shard. Even if the backup always uses leaders,
> you can't always be sure of where a leader is. It can change from
> moment to moment, especially if you're having stability problems with
> your cloud.
>
> At restore time, there's a similar situation. You don't know which
> machine(s) in the cloud are going to be actually loading index data from
> the backup location. So they all need to have access to the same data.
>
> Thanks,
> Shawn
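
For reference, the mount / backup / unmount coordination mentioned above hinges on the Collections API BACKUP request. Below is a rough sketch of how that request and its status check might be built. Everything concrete in it is an assumption for illustration: the host, collection name, backup name, mount path, async id, and the helper names `build_backup_url` and `build_status_url` are hypothetical. It only constructs URLs and does not contact a cluster.

```python
from urllib.parse import urlencode


def build_backup_url(base_url, collection, name, location, async_id=None):
    """Build a Collections API BACKUP request URL.

    `location` is the shared mount point; every Solr node must see the
    same path. All values passed in here are illustrative assumptions.
    """
    params = {
        "action": "BACKUP",
        "name": name,
        "collection": collection,
        "location": location,
    }
    if async_id:
        # Run asynchronously so a script can poll before unmounting.
        params["async"] = async_id
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def build_status_url(base_url, async_id):
    """Build a REQUESTSTATUS URL for a previously submitted async request."""
    params = {"action": "REQUESTSTATUS", "requestid": async_id}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical node address
    print(build_backup_url(base, "mycoll", "nightly", "/mnt/solr-backups", "bk-1"))
    print(build_status_url(base, "bk-1"))
```

A wrapper script would presumably mount the volume on every node, submit the BACKUP URL, poll the status URL until the request completes, and only then unmount. How the backup behaves if a node joins or leaves in between is the open question in this thread and is not addressed by the sketch.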