[jira] [Comment Edited] (HBASE-29521) Update Restore Command to Handle Bulkloaded Files

Vinayak Hegde (Jira) Tue, 26 Aug 2025 06:49:04 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-29521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016307#comment-18016307
 ]


Vinayak Hegde edited comment on HBASE-29521 at 8/26/25 1:48 PM:
----------------------------------------------------------------

As part of PITR, in addition to restoring the full and incremental backups, we 
also need to replay WAL edits for the remaining duration.

Example:
 * st = timestamp of the last backup restored as part of PITR

 * et = user-specified point-in-time for PITR

We need to replay WAL edits from st → et, and also re-bulkload any bulkloaded 
files that fall within this range.

Our current backup directory structure:
{code:java}
-- wal_backup_directory/
     -- WALs/
          -- 23-08-2025/
               ... wal files
          -- 24-08-2025/
          -- 25-08-2025/
     -- bulk-load-files/
          -- 23-08-2025/
               ... bulkload files
          -- 24-08-2025/
          -- 25-08-2025/
{code}
*Requirement*
 * Read WAL files from st → et, extract all bulkload file paths from the WAL 
edits, and feed them into the existing RestoreJob (MR job) to bulkload them 
into tables. RestoreJob logic is already implemented.

 * The main challenge is to efficiently extract bulkload entries from WALs.
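
One way to limit how many WAL files need to be read is to open only the day directories that overlap st → et. The sketch below is illustrative, not the actual implementation: it assumes the directory names follow the dd-MM-yyyy pattern shown in the layout above and that days are bucketed in UTC. The class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

final class WalDayDirs {
  // Assumption: day directories are named dd-MM-yyyy and bucketed by UTC date.
  private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("dd-MM-yyyy");

  /** Returns the day-directory names covering [stMillis, etMillis], oldest first. */
  static List<String> dayDirsBetween(long stMillis, long etMillis) {
    if (etMillis < stMillis) {
      throw new IllegalArgumentException("et must not precede st");
    }
    LocalDate first = Instant.ofEpochMilli(stMillis).atZone(ZoneOffset.UTC).toLocalDate();
    LocalDate last = Instant.ofEpochMilli(etMillis).atZone(ZoneOffset.UTC).toLocalDate();
    List<String> dirs = new ArrayList<>();
    // Walk day by day so that ranges spanning midnight include both days.
    for (LocalDate d = first; !d.isAfter(last); d = d.plusDays(1)) {
      dirs.add(DAY.format(d));
    }
    return dirs;
  }
}
```

The same list of names would apply under both WALs/ and bulk-load-files/, since the two trees share the per-day layout. Entries inside the first and last day still need to be filtered by timestamp.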

*Options*

 * New MR job
 ** Create a job to re-read the WALs (we already replay them with WALPlayer) 
and collect all bulkload entries between st → et.
 ** Write them to a file, then feed that list into the existing RestoreJob.
 ** Downside: WALs are read twice.

 * System table tracking
 ** Store all bulkload entries (with timestamps) in the backup system table as 
they occur.
 ** At PITR restore, simply query entries between st → et.
 ** Downside: introduces additional dependency on the system table, which we 
would like to avoid in the long run.
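
To make the system-table option concrete, here is a rough sketch of the lookup it implies. An in-memory sorted map stands in for the backup system table, so this is not HBase code. The class name, the method names, and the choice of st exclusive / et inclusive are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class BulkLoadIndex {
  // Stand-in for the system table: timestamp (ms) -> bulkloaded file paths.
  private final NavigableMap<Long, List<String>> byTimestamp = new TreeMap<>();

  /** Records one bulkloaded file at the time the bulkload occurred. */
  void record(long tsMillis, String path) {
    byTimestamp.computeIfAbsent(tsMillis, k -> new ArrayList<>()).add(path);
  }

  /**
   * Returns paths with st < timestamp <= et, in timestamp order.
   * Assumption: st is exclusive because the last restored backup already
   * covers it, and et is inclusive because it is the requested point in time.
   */
  List<String> pathsBetween(long st, long et) {
    List<String> out = new ArrayList<>();
    for (List<String> paths : byTimestamp.subMap(st, false, et, true).values()) {
      out.addAll(paths);
    }
    return out;
  }
}
```

With the MR-job option, the same range filter would be applied to each WAL entry's timestamp while scanning instead of to a stored index.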

What do you guys think? How should we handle this?

[~andor] [~swu] [~ssa] [~ankit.jhil] 


> Update Restore Command to Handle Bulkloaded Files
> -------------------------------------------------
>
>                 Key: HBASE-29521
>                 URL: https://issues.apache.org/jira/browse/HBASE-29521
>             Project: HBase
>          Issue Type: Sub-task
>          Components: backup&restore
>            Reporter: Vinayak Hegde
>            Assignee: Vinayak Hegde
>            Priority: Major
>
> Enhance the restore command to replay WAL edits first, then bulkload HFiles 
> from the backup location. Ensure PITR restore correctness and handle cases 
> where bulkloaded files are referenced in WALs. Validate the presence of all 
> required files before restore execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
