Thanks, Shawn, for your detailed reply. It has improved my understanding,
which I have summarised below.

In a SolrCloud setup on a version earlier than 6.1, there is no ‘elegant’ way
of handling collection backup and restore. Instead, we have to use the manual
backup and restore APIs of the replication handler. However, as these APIs
were primarily designed for standalone Solr installations, each call can only
back up the data stored on a single Solr host for a particular core. Hence, to
get the complete data backed up for a SolrCloud collection, the backup API
has to be called on all the nodes belonging to the SolrCloud cluster, and the
ZooKeeper clusterstate then has to be backed up manually, with possible
tweaking needed to ensure hash value consistency.
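
As a rough sketch of what I mean by calling the backup API on every node, the
snippet below builds one replication handler backup URL per core and then
requests each in turn. The host names, core names, backup location and the
helper functions are all made up for illustration; only the `command=backup`,
`location` and `name` parameters come from the replication handler docs. I
append the core name to the backup name on the assumption that several cores
may write into the same location.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_backup_url(base_url, core, location, name):
    """Build a replication handler backup URL for a single core."""
    query = urlencode({
        "command": "backup",   # replication handler backup command
        "location": location,  # directory on the Solr host to write into
        "name": name,          # name given to this backup
        "wt": "json",
    })
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


def plan_backup_urls(cores, location, name):
    """Return one backup URL per (base_url, core) pair.

    The core name is appended to the backup name so that cores sharing a
    location do not collide (an assumption of this sketch).
    """
    return [
        build_backup_url(base_url, core, location, f"{name}_{core}")
        for base_url, core in cores
    ]


def run_backups(urls, timeout=30):
    """Send each backup request in turn and return the raw responses."""
    responses = []
    for url in urls:
        with urlopen(url, timeout=timeout) as response:
            responses.append(response.read().decode("utf-8"))
    return responses


# Hypothetical cluster layout: two hosts, one core each.
EXAMPLE_CORES = [
    ("http://solr1:8983/solr", "mycoll_shard1_replica1"),
    ("http://solr2:8983/solr", "mycoll_shard2_replica1"),
]
```

Calling `run_backups(plan_backup_urls(EXAMPLE_CORES, "/backups/solr", "nightly"))`
would then request a backup from each core. The ZooKeeper clusterstate is not
covered by this and would still have to be saved separately.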

A few follow-up questions:
1. In SolrCloud, as a single host can hold multiple shards (either leaders
or replicas), how does the backup API handle the underlying data copy? I
presume it will simply copy the data across ALL the shards (both leaders and
replicas) for the specified collection.
2. If I invoke the backup command periodically to back up the data and then
invoke the restore command later (possibly after shutting down the cluster
and creating a fresh SolrCloud cluster), I presume I don't need to tinker
with the hash values as long as the default settings have been used in both
the backup and restore situations?

Thanks


On 2 June 2018 at 08:59:26, Shawn Heisey (apa...@elyograg.org) wrote:

On 6/2/2018 1:50 AM, Shawn Heisey wrote:
> If you provide a location parameter, it will write a new backup
> directory in that location.
>
> https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html#standalone-mode-backups
>
> I verified that this parameter is in the 5.5 docs too, I would suggest
> you download that version in PDF format if you want a full reference.

A followup:

I suspect that if you try to use the restore functionality on the
replication handler and have multiple shard replicas, that SolrCloud
would not replicate things properly.  I could be wrong about that, but I
think that restoring from replication handler backups to SolrCloud could
get a little messy.

Thanks,
Shawn
