[ 
https://issues.apache.org/jira/browse/HBASE-29776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044757#comment-18044757
 ] 

Hernan Gelaf-Romer edited comment on HBASE-29776 at 12/12/25 7:42 PM:
----------------------------------------------------------------------

[~dieterdp_ng] Can you help sanity check me a little? I feel that this 
conditional shouldn't exist at all. I don't think we should ever filter out an 
oldWAL for backing up. I can see two scenarios for data loss here

 
 # The conditional just seems wrong, and contradicts the comment. If the 
newTimestamp is null (and therefore the RS is obsolete), we filter it out from 
being backed up
 # newTimestamp represents the last time a backup rolled the WAL on the RS. Why 
would we filter out oldWALs that have been created since then? That seems wrong 
as well

 

The only oldWALs we should exclude are ones that have a ts older than our 
oldTimestamp imo. We're guaranteed to have those WALs in a previous backup. 
This is something the code already does


was (Author: JIRAUSER290343):
[~dieterdp_ng] Can you help sanity check me a little? I feel that this 
conditional shouldn't exist at all. I don't think we should ever filter out an 
oldWAL for backing up. I can see two scenarios for data loss here

 
 # The conditional just seems wrong, and contradicts the comment. If the 
newTimestamp is null (and therefore the RS is obsolete), we filter it out from 
being backed up
 # newTimestamp represents the last time a backup rolled the WAL on the RS. Why 
would we filter out oldWALs that have been created since then? That seems wrong 
as well

> Log filtering in IncrementalBackupManager can lead to data loss
> ---------------------------------------------------------------
>
>                 Key: HBASE-29776
>                 URL: https://issues.apache.org/jira/browse/HBASE-29776
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>            Reporter: Hernan Gelaf-Romer
>            Priority: Major
>
> At the moment, incremental backups will filter out old wals that belong to RS 
> which do not have any active WAL files. 
>  
> The code 
>  
> {code:java}
> // It is possible that a host in .oldlogs is an obsolete region server
> // so newestTimestamps.get(host) here can be null.
> // Even if these logs belong to a obsolete region server, we still need
> // to include they to avoid loss of edits for backup.
> Long newTimestamp = newestTimestamps.get(host);
> if (newTimestamp == null || currentLogTS > newTimestamp) {
>   newestLogs.add(currentLogFile);
> }{code}
> Is doing the opposite of what the comment (correctly) says. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to