Ankit Singhal created HBASE-28957:
-------------------------------------

             Summary: Adding support for continuous Backup and Point-in-Time Recovery
                 Key: HBASE-28957
                 URL: https://issues.apache.org/jira/browse/HBASE-28957
             Project: HBase
          Issue Type: Umbrella
          Components: backup&restore
    Affects Versions: 3.0.0-alpha-4, 2.6.0
            Reporter: Ankit Singhal

Current solutions like replication and snapshots offer data redundancy, but their limitations prevent effective point-in-time recovery (PITR) after data corruption or accidental changes. Replication requires a live cluster that mirrors the original, and keeping both clusters operational incurs substantial cost. Snapshots do not support point-in-time recovery, so changes made between snapshots can be lost. Incremental snapshots improve this situation but still do not provide full protection, as they only capture data at specific intervals.

Limitations of the Current Incremental Backup Solution

The current incremental backup solution in HBase has several critical limitations that highlight the need for continuous backup and PITR:

• Risk of Data Loss: Incremental backups are created in batches rather than continuously, so any change made after the last backup can be lost if data corruption or deletion occurs before the next scheduled backup.
• Restore Point Limitations: Users can restore data only to specific backup timestamps, not to an arbitrary moment in time. This restricts flexibility and can make it impossible to revert to the most recent stable state before an issue.
• WAL Management Challenges: Write-Ahead Logs (WALs) on the source cluster cannot be archived until the backup process completes, which makes WAL management complex and storage-intensive on the source cluster.
• Complex Backup Tracking: Managing backup IDs, job history, and logs requires substantial manual tracking and oversight to ensure consistency.
• Dependency on YARN: The incremental backup process relies on a YARN cluster to move WALs, adding both a resource dependency and complexity to the backup workflow.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)