anmolnar opened a new pull request, #8044:
URL: https://github.com/apache/hbase/pull/8044

   Hi all,
   
   We would like to propose merging the feature “Read Replica Cluster” into 
   the main branch.
   
   **Background**
   
   We’d like to implement the open source version of Amazon’s
   [Read Replica Cluster on S3](https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/)
   feature for Apache HBase. It adds the ability to run another HBase
   cluster on the same cloud storage location in read-only mode, allowing
   users to share the read workload between multiple clusters. Due to the
   characteristics of the implementation and the lack of automated
   synchronization between the active and read-replica clusters, read replicas
   are eventually consistent, so they are not suitable for reading the most
   recent data. However, we still believe that users of open source Apache
   HBase could take advantage of this feature and that there are use cases
   where read replicas can help. Please find more information about the
   feature in the linked blog post.
   
   **Pros**
   
   - Running multiple clusters in different Availability Zones adds HA to the 
   entire workload,
   - No need for data movement or duplication (as in an active-active 
   replication setup), which saves both cost and time,
   - No limit on the number of read replica clusters
   
   **Cons**
   
   - Read Replica clusters are eventually consistent: in-memory data on the 
   active cluster is not visible from read replicas,
   - Read Replica clusters must be refreshed manually: flush on the active 
   cluster, then refresh hfiles/meta on the read replicas
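
   The flush-then-refresh cycle above can be illustrated with a toy model.
   This is only a sketch of the visibility semantics, not HBase code: the
   class and method names (`SharedStorage`, `ActiveCluster`, `ReadReplica`,
   `flush`, `refresh`) are hypothetical and do not correspond to the HBase
   API or to the commands added by this feature.

```python
class SharedStorage:
    """Stands in for the shared cloud storage location (hypothetical)."""

    def __init__(self):
        # Each flushed file is modelled as an immutable snapshot: row -> value.
        self.hfiles = []


class ActiveCluster:
    """Read-write cluster: buffers writes in memory until flushed."""

    def __init__(self, storage):
        self.storage = storage
        self.memstore = {}

    def put(self, row, value):
        # Writes land in memory first and are not yet on shared storage.
        self.memstore[row] = value

    def flush(self):
        # Persist the in-memory data as a new file on shared storage.
        if self.memstore:
            self.storage.hfiles.append(dict(self.memstore))
            self.memstore.clear()


class ReadReplica:
    """Read-only cluster: serves only the files it learned about at refresh."""

    def __init__(self, storage):
        self.storage = storage
        self.known_hfiles = []

    def refresh(self):
        # Manual step: pick up the current list of files on shared storage.
        self.known_hfiles = list(self.storage.hfiles)

    def get(self, row):
        # Newest file wins; files written after the last refresh are unseen.
        for hfile in reversed(self.known_hfiles):
            if row in hfile:
                return hfile[row]
        return None


storage = SharedStorage()
active = ActiveCluster(storage)
replica = ReadReplica(storage)

active.put("row1", "v1")
before_flush = replica.get("row1")    # still in the active cluster's memory
active.flush()
before_refresh = replica.get("row1")  # on storage, but replica not refreshed
replica.refresh()
after_refresh = replica.get("row1")   # now visible on the replica
```

   In this model the replica sees the write only after both manual steps
   have run, which is the eventual consistency described in the Cons above.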
   
   A detailed description of the design and implementation can be found in the
   following document.
   
   [Apache HBase Read Replica Cluster Feature](https://docs.google.com/document/d/1EI0lsURX1BZhv3DYgMvZCl4EUy-ADJRkHUc1PjzZtj0/edit?usp=sharing)
   
   Please review and share your feedback or comments.
   
   Best regards,
   Andor


