Hi, This procedure looks fine, but it is a little complex to automate.
Why not consider a backup strategy based on CDCR for SolrCloud, or on replication for standalone Solr?

For SolrCloud, CDCR can be configured with the source and target collections in the same SolrCloud cluster. The target collection can have its shards located on dedicated nodes, with a replication factor of 1. You need to be careful to locate the target nodes on separate hardware (VM and storage), and ideally in a separate geographical location. This way you can achieve very good RPO and RTO.

If the RTO requirement is lenient, the dedicated backup-destination nodes can get by with little CPU and RAM. If the RTO requirement is strict, one can imagine the backup collection becoming the live collection very quickly instead of running a restore, or serving in a degraded, search-only mode during the restore.

Regards,

Dominique

On Thu, Aug 6, 2020 at 16:18, Bram Van Dam <bram.van...@intix.eu> wrote:
> Hey folks,
>
> Been reading up about the various ways of creating backups. The whole
> "shared filesystem for Solrcloud backups" thing is kind of a no-go in
> our environment, so I've been looking for ways around that, and here's
> what I've come up with so far:
>
> 1. Stop applications from writing to Solr.
> 2. Commit everything.
> 3. Identify a single core for each shard in each collection.
> 4. Snapshot that core using CREATESNAPSHOT in the Collections API.
> 5. Once complete, re-enable application write access to Solr.
> 6. Create a backup from these snapshots using the replication handler's
>    backup function (replication?command=backup&commitName=mySnapshot).
> 7. Put the backups somewhere safe.
> 8. Clean up the snapshots.
>
> This seems ... too good to be true? I've seen so many threads on this
> mailing list over the years about how hard it is to create backups in
> SolrCloud, but this seems pretty straightforward. Am I missing some
> glaringly obvious reason why this will fail catastrophically?
>
> We're using Solr 7.7 in this case.
>
> Feedback much appreciated!
>
> Thanks,
>
> - Bram
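For the same-cluster CDCR setup suggested at the top of this reply, the configuration lives in the source collection's solrconfig.xml. A minimal, hedged sketch follows; the zkHost string, collection names, and scheduling values are placeholders I've invented for illustration, so check the Solr 7.x CDCR reference for the exact options before using it:

```xml
<!-- Source collection's solrconfig.xml: forward updates to the target.
     With source and target in the same cluster, zkHost points at the
     shared ZooKeeper ensemble. live_collection / backup_collection are
     placeholder names. -->
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">zk1:2181,zk2:2181,zk3:2181</str>
    <str name="source">live_collection</str>
    <str name="target">backup_collection</str>
  </lst>
  <lst name="replicator">
    <str name="threadPoolSize">2</str>
    <str name="schedule">1000</str>
    <str name="batchSize">128</str>
  </lst>
</requestHandler>

<!-- Both collections also need the CDCR-aware update log. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```

Replication is then started with the CDCR API on the source collection (action=START on the /cdcr handler).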
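The snapshot-and-backup steps quoted above can be sketched as a small script. This is a hedged dry run, not a tested procedure: the host, collection, core, snapshot name, and backup location are placeholder values, and the script only prints the curl commands so the sequence is easy to review before running it for real.

```shell
#!/bin/sh
# Sketch of the commit -> snapshot -> backup -> cleanup sequence.
# SOLR, COLLECTION, CORE, SNAPSHOT and BACKUP_DIR are assumed placeholders.
SOLR="http://localhost:8983/solr"
COLLECTION="mycollection"
CORE="mycollection_shard1_replica_n1"   # one core per shard (step 3)
SNAPSHOT="mySnapshot"
BACKUP_DIR="/backups/solr"

# Dry run: print each call instead of executing it.
# Swap echo for a real `curl -s` once the values are correct.
run() { echo "curl -s \"$1\""; }

# Step 2: hard-commit everything before snapshotting
run "$SOLR/$COLLECTION/update?commit=true"
# Step 4: collection-level snapshot via the Collections API
run "$SOLR/admin/collections?action=CREATESNAPSHOT&collection=$COLLECTION&commitName=$SNAPSHOT"
# Step 6: per-core backup of the named snapshot via the replication handler
run "$SOLR/$CORE/replication?command=backup&commitName=$SNAPSHOT&location=$BACKUP_DIR"
# Step 8: clean up the snapshot once the backup is safely copied away
run "$SOLR/admin/collections?action=DELETESNAPSHOT&collection=$COLLECTION&commitName=$SNAPSHOT"
```

Steps 1, 5, and 7 (pausing and resuming the writing applications, and shipping the backup off-box) are application-specific and left out of the sketch. Step 6 would be repeated for one core of each shard of each collection.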