On 9/30/24 09:39, Default User wrote:
Hi!

On a thread at another mailing list, someone mentioned that they, each
day, alternate doing backups between two external usb drives. That got
me to thinking (which is always dangerous) . . .

I have a full backup on usb external drive A, "refreshed" daily using
rsnapshot. Then, every day, I use rsync to make usb external drive B an
"exact" copy of usb external drive A. It seemed to be a good idea,
since if drive A fails, I can immediately plug in drive B to replace
it, with no down time, and nothing lost.

But of course, any errors on drive A propagate daily to drive B.

So, is there a consensus on which would be better:
1) continue to "mirror" drive A to drive B?
or,
2) alternate backups daily between drives A and B?


I migrated my data to a dedicated ZFS file server several years ago, in part for the advanced ZFS backup features -- snapshots, compression, de-duplication, replication, etc. I used FreeBSD, but Debian has ZFS and should be able to do the same thing.
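
Debian ships ZFS in the contrib archive area; installation is one line
(a sketch -- Debian builds the kernel module via DKMS):

# apt install zfs-dkms zfsutils-linux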


My live server has a ZFS pool with two striped mirrors of two 3 TB HDDs each and a 'special' mirror of two 180 GB SSDs:

2024-09-30 16:44:38 toor@f5 ~
# zpool iostat -v p5
                                  capacity     operations     bandwidth
pool                            alloc   free   read  write   read  write
------------------------------  -----  -----  -----  -----  -----  -----
p5                              3.19T  2.39T     49      2  28.4M  69.2K
  mirror-0                      1.58T  1.14T     21      0  14.0M  10.7K
    gpt/hdd1.eli                    -      -      8      0  6.99M  5.35K
    gpt/hdd2.eli                    -      -     12      0  6.99M  5.35K
  mirror-1                      1.58T  1.13T     20      0  14.0M  10.4K
    gpt/hdd3.eli                    -      -     10      0  7.00M  5.20K
    gpt/hdd4.eli                    -      -      9      0  7.00M  5.20K
special                             -      -      -      -      -      -
  mirror-2                      29.4G   120G      7      2   408K  48.1K
    gpt/ssd1.eli                    -      -      3      1   204K  24.1K
    gpt/ssd2.eli                    -      -      3      1   204K  24.1K
------------------------------  -----  -----  -----  -----  -----  -----


The 'special' SSD mirror stores metadata, which improves overall performance.
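
A pool of this shape would be built along these lines (a sketch, not
the exact command I ran; the gpt/*.eli labels are the encrypted GPT
partitions shown in the iostat output above):

# zpool create p5 \
      mirror gpt/hdd1.eli gpt/hdd2.eli \
      mirror gpt/hdd3.eli gpt/hdd4.eli \
      special mirror gpt/ssd1.eli gpt/ssd2.eli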


I create ZFS filesystems for groups of data -- Samba users, CVS repository, rsync(1) backups of various non-ZFS filesystems, raw disk image backups, etc.
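
Each group is an ordinary child filesystem; creation is along these
lines (p5/samba/dpchrist appears below, the other names are made up):

# zfs create p5/samba
# zfs create p5/samba/dpchrist
# zfs create p5/cvs
# zfs create p5/rsync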


ZFS has various properties that you can tune for each filesystem. Here is the filesystem for my Samba data:

2024-09-30 16:50:07 toor@f5 ~
# zfs get all p5/samba/dpchrist | sort | egrep 'NAME|inherited'
NAME               PROPERTY               VALUE                      SOURCE
p5/samba/dpchrist  atime                  off                        inherited from p5
p5/samba/dpchrist  com.sun:auto-snapshot  true                       inherited from p5
p5/samba/dpchrist  compression            on                         inherited from p5
p5/samba/dpchrist  dedup                  verify                     inherited from p5
p5/samba/dpchrist  mountpoint             /var/local/samba/dpchrist  inherited from p5/samba
p5/samba/dpchrist  special_small_blocks   16K                        inherited from p5


'atime' is off to eliminate metadata writes when files and directories are read.


'com.sun:auto-snapshot' is true so that zfs-auto-snapshot(8) run via crontab(1) will find this filesystem, take snapshots periodically (daily, monthly, yearly), and manage (prune) those snapshots:

2024-09-30 16:54:00 toor@f5 ~
# crontab -l
 9 3 * * * /usr/local/sbin/zfs-auto-snapshot -k d 40
21 3 1 * * /usr/local/sbin/zfs-auto-snapshot -k m 99
27 3 1 1 * /usr/local/sbin/zfs-auto-snapshot -k y 99


I currently have 96 snapshots (i.e. backups) of the above filesystem going back four and a half years:

2024-09-30 16:59:48 dpchrist@f5 ~
$ ls -d /var/local/samba/dpchrist/.zfs/snapshot/zfs-auto-snap_[dmy]* | wc -l
      96

2024-09-30 17:01:12 dpchrist@f5 ~
$ ls -dt /var/local/samba/dpchrist/.zfs/snapshot/zfs-auto-snap_[dmy]* | tail -n 1
/var/local/samba/dpchrist/.zfs/snapshot/zfs-auto-snap_m-2020-03-01-00h21
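
Because snapshots appear as read-only directories under .zfs/snapshot/,
restoring a file is an ordinary copy -- for example (snapshot and file
names are hypothetical):

$ cd /var/local/samba/dpchrist/.zfs/snapshot/zfs-auto-snap_d-2024-09-29-03h09
$ cp -p some.file ~/restored.file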


'compression' is on so that compressible data is compressed. (ZFS compresses block by block; a block that does not compress enough is stored uncompressed, so incompressible files cost almost nothing.)


'dedup' is set to 'verify' so that duplicate blocks are saved only once within the pool, with a byte-for-byte comparison before a block is deduplicated. De-duplication metadata is stored on the pool 'special' SSD mirror, which improves de-duplication performance.


'special_small_blocks' is set to 16K so that blocks of 16 KiB and smaller -- which includes small files -- are stored on the pool 'special' SSD mirror, improving small-file read and write performance.
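
All of the above were set once near the top of the tree and inherited
downward, as the SOURCE column confirms; roughly:

# zfs set atime=off p5
# zfs set compression=on p5
# zfs set dedup=verify p5
# zfs set special_small_blocks=16K p5
# zfs set com.sun:auto-snapshot=true p5
# zfs set mountpoint=/var/local/samba p5/samba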


I have a backup server with matching pool construction. I periodically replicate live server snapshots to the backup server (via SSH pull and pre-shared keys). I would like to automate this task.
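
The manual version of that pull is zfs send piped into zfs receive,
run on the backup server -- a sketch (the backup pool 'b5' and the
snapshot names are placeholders):

# ssh f5 zfs send -R -I p5@zfs-auto-snap_d-OLD p5@zfs-auto-snap_d-NEW \
      | zfs receive -Fdu b5

The -R/-I pair sends the whole dataset tree and every intermediate
snapshot between the two, so the backup server keeps the same snapshot
history as the live server.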


Both servers have SATA HDD mobile rack bays:

https://www.startech.com/en-us/hdd/drw150satbk


I have a pair of 6 TB HDDs in corresponding mobile rack trays, one for near-site backups and one for off-site backups. Each HDD contains one ZFS pool. I periodically insert the near-site HDD into the backup server and replicate the live server snapshots to the removable HDD. I periodically rotate the near-site HDD and the off-site HDD.
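
Each rotation then reduces to import, receive, export; roughly (the
removable pool name 'r6' is made up):

# zpool import r6
# ssh f5 zfs send -R -I p5@zfs-auto-snap_d-OLD p5@zfs-auto-snap_d-NEW \
      | zfs receive -Fdu r6
# zpool export r6

Exporting flushes all writes and marks the pool clean, so the tray can
be pulled safely afterward.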


Be warned that ZFS has a non-trivial learning curve. I suggest the Lucas books if you are interested:

https://mwl.io/nonfiction/os#fmzfs

https://mwl.io/nonfiction/os#fmaz


David
