I've been using Bacula for many years, but as the volume of our data
has grown and we've gotten a new tape library, I'm about to implement
a new strategy for our backup jobs and I'd like your feedback.
Environment: scientific research center
Data volume: ~400TB
Growth rate: ~20TB/month (new data)
Churn rate: ~10TB/month (total size of existing files whose content
changes but whose size does not change significantly)
Backup device: tape library, 3x LTO8 drives, 80x LTO tapes
Backup window: undefined
Restore window: undefined
Bacula version: 9.6.7
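
For scale, here is a rough sketch of how many cartridges one full backup would need. It assumes 12 TB native capacity per LTO-8 cartridge and no compression; both are assumptions, since the real figure depends on the media generation in the library and on how compressible the data is. The function names are only illustrative.

```python
import math

# Assumption: LTO-8 native (uncompressed) capacity of 12 TB per cartridge.
LTO8_NATIVE_TB = 12
DATA_TB = 400               # current data volume, from the list above
GROWTH_TB_PER_MONTH = 20    # new data per month, from the list above
TAPES_AVAILABLE = 80


def tapes_per_full(data_tb, tape_tb=LTO8_NATIVE_TB):
    """Cartridges needed for one full backup, ignoring compression."""
    return math.ceil(data_tb / tape_tb)


def data_after(months, start_tb=DATA_TB, growth=GROWTH_TB_PER_MONTH):
    """Projected data volume after some months of linear growth."""
    return start_tb + growth * months


if __name__ == "__main__":
    for months in (0, 6, 12):
        size = data_after(months)
        n = tapes_per_full(size)
        print(f"+{months:2d} months: {size} TB -> {n} tapes per full, "
              f"{TAPES_AVAILABLE // n} full set(s) fit on {TAPES_AVAILABLE} tapes")
```

Changing LTO8_NATIVE_TB to an effective (compressed) capacity shows how sensitive the tape count is to the compression ratio actually achieved.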
We're using the GPFS filesystem, and doing filesystem snapshots every
15 minutes, with a limited set retained for at least 2 months. The
snapshots allow for almost instant restores of recent data and comparison
between different versions of files, without system administrator
intervention.
Because of snapshots, I'm planning to eliminate all nightly incremental
& differential backups to tape. Tape backups would be only for
archival/disaster-recovery purposes and for compliance with grant and
data management requirements.
The new strategy would be to do a full backup every 2 months, each kept
for 5 months. One backup per year would be kept for at least 2 years; the
others would be rotated (media reused). For example:
    January 2021      keep until January 2023
    March 2021        keep until August 2021
    May 2021          keep until October 2021
    July 2021         keep until December 2021
    September 2021    re-use March 2021 media, keep until February 2022
    November 2021     re-use May 2021 media, keep until April 2022
    January 2022      keep until January 2024
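
The rotation above can be written down as a small rule: a full every 2 months, the January full kept 24 months, every other full kept 5 months. A minimal sketch that reproduces the example dates follows; the function and parameter names are made up for illustration and are not Bacula settings.

```python
from datetime import date


def add_months(d, months):
    """Return the first day of the month `months` after date `d`."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def schedule(start, count, interval=2, keep_short=5, keep_long=24, long_month=1):
    """Yield (backup_month, keep_until) pairs for `count` full backups."""
    for i in range(count):
        run = add_months(start, i * interval)
        # The backup taken in `long_month` (January) is the long-term one.
        keep = keep_long if run.month == long_month else keep_short
        yield run, add_months(run, keep)


if __name__ == "__main__":
    for run, until in schedule(date(2021, 1, 1), 7):
        print(f"{run:%B %Y}  keep until {until:%B %Y}")
```

In Bacula terms the two retention periods would presumably map onto two pools with different volume retention, but that mapping is an assumption and not part of the sketch.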
All tape backups would be done from a snapshot, so that no files within
the source of the backup change during the process. A "run before job"
script would dump coherent copies of databases, then create a filesystem
snapshot dedicated to the backup. That snapshot would be removed when
the backup is complete.
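
A minimal sketch of what the before/after steps might look like is below. The filesystem device name, the snapshot name, and the database dump script path are all hypothetical placeholders; only the GPFS commands mmcrsnapshot and mmdelsnapshot are real, and they are used here in their basic "device, snapshot name" form.

```python
import subprocess

FILESYSTEM = "gpfs0"        # hypothetical GPFS device name
SNAPSHOT = "bacula_full"    # hypothetical name for the dedicated snapshot
DB_DUMP_CMD = ["/usr/local/sbin/dump_databases"]  # hypothetical site script


def before_commands():
    """Commands for the "run before job" step: dump databases, then snapshot."""
    return [DB_DUMP_CMD, ["mmcrsnapshot", FILESYSTEM, SNAPSHOT]]


def after_commands():
    """Command for the step after the backup: remove the dedicated snapshot."""
    return [["mmdelsnapshot", FILESYSTEM, SNAPSHOT]]


def run_all(commands, dry_run=False):
    """Run commands in order; stop at the first failure and return its code."""
    for cmd in commands:
        print(" ".join(cmd))
        if dry_run:
            continue
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0


def main(argv):
    """argv[0] is "before" or "after"; add "--dry-run" to only print commands."""
    phase = argv[0] if argv else "before"
    commands = before_commands() if phase == "before" else after_commands()
    return run_all(commands, dry_run="--dry-run" in argv)
```

A wrapper would end with something like sys.exit(main(sys.argv[1:])), so that a failed database dump or snapshot creation returns a non-zero exit code and the backup job can be made to fail instead of reading from a live filesystem.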
We've got about 700 top-level directories for user accounts and research
projects. We'll probably run an individual backup job for each group of
directories alphabetically (A*, B*, etc), so that the 400TB will be spread
(unevenly) across about 45 Bacula jobs.
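
To see how uneven the split would be, the directories can be grouped by first character and their sizes summed. A minimal sketch follows; the sample directory names and sizes are invented, and the grouping is case-sensitive to match shell-style patterns such as A* and a*.

```python
from collections import defaultdict


def group_by_initial(dir_sizes):
    """Group top-level directories by first character (case-sensitive).

    dir_sizes maps directory name -> size in TB.
    Returns {initial: (sorted directory names, total size in TB)}.
    """
    names = defaultdict(list)
    totals = defaultdict(float)
    for name, size in dir_sizes.items():
        key = name[0]
        names[key].append(name)
        totals[key] += size
    return {k: (sorted(names[k]), totals[k]) for k in sorted(names)}


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = {"alpha": 12.0, "atlas": 3.5, "brainmap": 40.0,
              "Bold": 1.5, "cortex": 7.0}
    for key, (dirs, total) in group_by_initial(sample).items():
        print(f"job {key}*: {len(dirs)} dirs, {total:.1f} TB")
```

Feeding it real per-directory sizes would show which letters dominate and whether any single job is large enough to be worth splitting further.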
Thoughts? Suggestions?
Thanks,
Mark
--
Mark Bergman voice: 215-746-4061
[email protected] fax: 215-614-0266
http://www.med.upenn.edu/cbica/
IT Technical Director, Center for Biomedical Image Computing and Analytics
Department of Radiology University of Pennsylvania