[
https://issues.apache.org/jira/browse/HBASE-29219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17951827#comment-17951827
]
Vinayak Hegde commented on HBASE-29219:
---------------------------------------
As we can see here:
[https://github.com/apache/hbase/blob/d9b1aa108960bafcb5edaa676833d85a6025c4a9/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/WALInputFormat.java#L310]
HBase already has a mechanism to skip missing files while processing the list
of input files.
We could adopt a similar approach to handle empty files.
Specifically, we can introduce a new configuration option that allows skipping
empty files. If this option is enabled and a file is found to be empty (e.g.,
file length is 0), we can log a warning and continue processing the remaining
files.
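
A minimal sketch of that idea follows. It is not the actual WALInputFormat
code: it uses java.nio.file as a stand-in for the Hadoop FileSystem API so that
it runs without HBase on the classpath. The class name, method name, and
configuration key are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Illustrative only: java.nio.file stands in for Hadoop's FileSystem here.
final class EmptyWalFilter {
  private static final Logger LOG = Logger.getLogger(EmptyWalFilter.class.getName());

  // Hypothetical configuration key; not an existing HBase setting.
  static final String IGNORE_EMPTY_FILES_KEY = "wal.input.ignore.empty.files";

  private EmptyWalFilter() {
  }

  /**
   * Returns the candidate files that should be handed to the WAL reader.
   * When ignoreEmpty is true, zero-length files are logged and skipped;
   * when it is false, every candidate is returned unchanged.
   */
  static List<Path> selectReadable(List<Path> candidates, boolean ignoreEmpty)
      throws IOException {
    List<Path> readable = new ArrayList<>();
    for (Path file : candidates) {
      if (ignoreEmpty && Files.size(file) == 0) {
        LOG.warning("Skipping empty WAL file: " + file);
        continue;
      }
      readable.add(file);
    }
    return readable;
  }
}
```

With the flag disabled the behavior is unchanged, so an empty file would still
reach the reader and fail as it does today. That keeps the skip strictly
opt-in.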
> Ignore Empty WAL Files While Consuming Backed-Up WAL Files
> ----------------------------------------------------------
>
> Key: HBASE-29219
> URL: https://issues.apache.org/jira/browse/HBASE-29219
> Project: HBase
> Issue Type: Task
> Components: backup&restore
> Reporter: Vinayak Hegde
> Assignee: asolomon
> Priority: Major
>
> Currently, {{ContinuousBackupReplicationEndpoint}} creates WAL files in the
> backup location and continuously writes incoming WAL entries.
> These WAL files are consumed during *Point-In-Time Recovery (PITR)* and
> *incremental backups*. However, an edge case may arise:
> If we attempt to access a WAL file that is currently *empty* (for
> instance, when the *Continuous Backup Replication Endpoint* closes a file,
> opens a new one for writing, but has not yet written any data), an exception
> occurs. The WAL file reader throws an error indicating that the *WAL PB
> magic is missing*.
> To handle this, we need to *skip/ignore empty WAL files* and continue
> processing the remaining files without interruption.