Re: [PR] HBASE-28882 Backup restores are broken if the backup has moved locations [hbase]

via GitHub Wed, 25 Sep 2024 05:46:00 -0700


rmdmattingly commented on PR #6294:
URL: https://github.com/apache/hbase/pull/6294#issuecomment-2373984553


   > No, my testcase was "can a fresh HBase cluster restore a backup".
   
   👍 
   
   >  fresh cluster won't have any record stored of where backups were 
previously stored. So why is this an issue for a cluster that created the 
now-moved backup?
   
   This is a good question that I don't have a _certain_ answer to because the 
test that I'm running internally actually restores to the existing cluster 
in-place (it drops the existing table, and then restores from backup).
   
   But I believe it would still be a problem for a restore to a new cluster, 
because we don't load the backup manifest from metadata on the HBase cluster; 
[we load the backup manifest from its on-disk persistence in the given 
backup](https://github.com/apache/hbase/blob/bcbd12980942946d48adb40898746d8239230464/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManifest.java#L429-L442).
 
   
   ```java
             // load and set manifest field from file content
             long len = subFile.getLen();
             byte[] pbBytes = new byte[(int) len];
             try (FSDataInputStream in = fs.open(subFile.getPath())) {
               in.readFully(pbBytes);
             } catch (IOException e) {
               throw new BackupException(e.getMessage());
             }
             BackupProtos.BackupImage proto = null;
             try {
               proto = BackupProtos.BackupImage.parseFrom(pbBytes);
             } catch (Exception e) {
               throw new BackupException(e);
             }
   ```
   
   For example:
   1. cluster-a takes a backup and persists it to `/primary/root/dir`
   2. we move that backup to `/secondary/root/dir`, verbatim, via S3 
replication (or something similar)
   3. we decide we want to restore this backup onto cluster-b
   4. we send a restore request to cluster-b for this backup, with the root dir 
specified as `/secondary/root/dir`
   5. the restore will read the BackupManifest file from `/secondary/root/dir`, 
as intended, but in the aforementioned code we'll blindly parse a proto that 
has the root dir listed as `/primary/root/dir`, both for its own BackupImage 
and all of the ancestor images
   6. the restore will now go awry with an unclear/inconsistent root dir. 
Maybe it would succeed if your primary root dir still worked, the backup was 
still there, and your permissions allowed access to both, but I don't think that 
"success" really aligns with the operator's intention in passing in the 
secondary root. More likely, it will fail, because you've passed in a 
secondary root for a reason (perhaps the primary root dir is down due to a 
region outage, or perhaps you just enforce good least privilege).
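
To make steps 5 and 6 concrete, here is a rough sketch of the kind of 
correction I have in mind: after parsing the persisted manifest, overwrite the 
stored root dir with the one the operator passed in, for the image itself and 
for every ancestor. `Image` and `ManifestRebase` are hypothetical stand-ins 
for illustration only, not HBase's real `BackupImage` classes, the backup ids 
are made up, and this is not necessarily how the PR implements it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed BackupImage; not the real HBase class.
final class Image {
  final String backupId;
  final String rootDir;
  final List<Image> ancestors;

  Image(String backupId, String rootDir, List<Image> ancestors) {
    this.backupId = backupId;
    this.rootDir = rootDir;
    this.ancestors = ancestors;
  }
}

// Hypothetical helper; the name and shape are assumptions for illustration.
final class ManifestRebase {
  // Returns a copy of the image whose root dir, and the root dir of every
  // ancestor, is replaced by the root dir given in the restore request.
  static Image withRootDir(Image image, String requestedRootDir) {
    List<Image> rebasedAncestors = new ArrayList<>();
    for (Image ancestor : image.ancestors) {
      rebasedAncestors.add(withRootDir(ancestor, requestedRootDir));
    }
    return new Image(image.backupId, requestedRootDir, rebasedAncestors);
  }

  public static void main(String[] args) {
    // What cluster-a persisted: everything points at the primary root.
    Image full = new Image("backup_1", "/primary/root/dir", new ArrayList<>());
    List<Image> ancestors = new ArrayList<>();
    ancestors.add(full);
    Image incremental = new Image("backup_2", "/primary/root/dir", ancestors);

    // What cluster-b should act on when asked to restore from the secondary root.
    Image rebased = withRootDir(incremental, "/secondary/root/dir");
    System.out.println(rebased.rootDir);
    System.out.println(rebased.ancestors.get(0).rootDir);
  }
}
```

The sketch copies rather than mutates, so the manifest as persisted stays 
untouched and only the in-memory view used by the restore changes.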


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

