: The ReplicationHandler still works when you use SolrCloud, right? can't you
: just replicate from one (or N, depending on the number of shards) of the
: nodes in the cluster? That way you could keep a Solr instance that's only
: used to replicate the indexes, and you could have it somewhere else (other

If you only replicated from one node in the cluster, you would only get backups of the shards that exist on that node -- not any shards that only exist on other machines.

I think that's what Tommaso was suggesting: a tool/client that could ask ZK about the cluster state, and then use that to generate a list of collection => shards+nodes, so that it could ensure it SnapPulled a copy of every shard for every collection from some node.

Of course, if your collections are big enough that you are sharding, trying to have a single backup server probably wouldn't be viable anyway, so a tool like that would need options to split the work up.

An alternate strategy might be to leverage the existing backup functionality of the ReplicationHandler, but add logic to make it zk/cloud aware, so that a single request to "backup" for a collection would propagate to all of the shard leaders to (delegate to a node to) backup that shard -- then you just need to configure the backup location for the ReplicationHandler to be a directory that is on your NAS.

-Hoss
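
P.S. A rough sketch of the "collection => shards+nodes" step, in case it helps make the idea concrete. It assumes the cluster state has already been read from ZK and parsed into a dict shaped roughly like clusterstate.json (collection -> "shards" -> shard -> "replicas" -> replica with "state", "base_url", "core" and "leader" keys); that shape, the function names, and the sample hosts are all illustrative assumptions, and actually reading from ZK is not shown. It picks one active replica per shard (preferring the leader) and builds a ReplicationHandler backup URL for it, which is a client-side approximation of the second strategy rather than real zk-aware logic inside Solr.

```python
from urllib.parse import urlencode


def pick_backup_sources(cluster_state):
    """Map each collection to {shard: core_url}, one active replica per shard.

    cluster_state is assumed to look like a parsed clusterstate.json.
    The leader is preferred when it is active.
    """
    plan = {}
    for collection, cdata in cluster_state.items():
        shard_plan = {}
        for shard, sdata in cdata.get("shards", {}).items():
            active = [r for r in sdata.get("replicas", {}).values()
                      if r.get("state") == "active"]
            if not active:
                # Refuse to produce a plan that silently skips a shard.
                raise RuntimeError(f"no active replica for {collection}/{shard}")
            # False sorts before True, so the leader (if any) comes first.
            active.sort(key=lambda r: r.get("leader") != "true")
            chosen = active[0]
            shard_plan[shard] = chosen["base_url"].rstrip("/") + "/" + chosen["core"]
        plan[collection] = shard_plan
    return plan


def backup_url(core_url, location):
    """Build a ReplicationHandler backup request URL for one core."""
    query = urlencode({"command": "backup", "location": location})
    return f"{core_url}/replication?{query}"


if __name__ == "__main__":
    # Hypothetical two-shard collection spread over two machines.
    sample = {
        "collection1": {"shards": {
            "shard1": {"replicas": {
                "core_node1": {"state": "active", "core": "collection1",
                               "base_url": "http://host1:8983/solr", "leader": "true"}}},
            "shard2": {"replicas": {
                "core_node2": {"state": "active", "core": "collection1",
                               "base_url": "http://host2:8983/solr", "leader": "true"}}},
        }}
    }
    for coll, shards in pick_backup_sources(sample).items():
        for shard, url in shards.items():
            print(coll, shard, backup_url(url, "/mnt/nas/backups"))
```

Splitting the work up across several backup machines would then just be a matter of partitioning the resulting plan by shard.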