On 8/5/2020 1:51 PM, deloptes wrote:
Imagine you have a classical backup scheme: daily incrementals, a weekly
full and a monthly full.
This is not required with DAR. One full backup is all that is
required. Anything else is a waste of time and space. One can employ
either differential or decremental backups, keeping a full history of
all activity on the system.
Imagine your retention is 3 for the weekly fulls (until the end of the
month) and 12 for the monthly fulls (until the end of the year).
You have to maintain 15 full backups plus the 6 daily incrementals. How
much space do you need for your backup storage?
This is why I asked what your active size is. Now imagine I have 2TB of
data. Even if I compress this data, let's say with an avg. 60% ratio, it is
800GB per full backup. 15+ copies means 12TB+. Of course, if you have
video/audio like mp3, it is already compressed, so the ratio for the backup
compression goes down and the space needed goes up.
Now here comes the trick with deduplication. The backup system makes one
full backup (800GB) and then keeps track of the bits that changed (it is
not that simple, but it will do for the example). Only those are backed up.
Some systems provide a ratio of 90%. So to keep your 15+ copies with a
deduplication ratio of ~80% you need about 3TB.
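
The arithmetic in the example above can be reproduced with a short Python
sketch. The function names are hypothetical, and both ratios are read as
the fraction of space saved, which is what the 2TB to 800GB figure implies.
The 60% and 80% values are just the example figures, not measurements.

```python
def full_backup_storage_tb(active_tb, compression_saving, copies):
    """Space needed to keep `copies` independent compressed full backups."""
    per_full_tb = active_tb * (1 - compression_saving)
    return per_full_tb * copies


def deduplicated_storage_tb(raw_tb, dedup_saving):
    """Space left after deduplication removes `dedup_saving` of the raw total."""
    return raw_tb * (1 - dedup_saving)


# Example figures from the text: 2TB active data, 60% saved by compression,
# 15 retained full backups, ~80% saved by deduplication.
raw_tb = full_backup_storage_tb(2.0, 0.60, 15)
dedup_tb = deduplicated_storage_tb(raw_tb, 0.80)

print(f"15 compressed fulls: about {raw_tb:.1f} TB")
print(f"with deduplication:  about {dedup_tb:.1f} TB")
```

With these inputs the deduplicated total comes out a little under the
"about 3TB" quoted above, which is consistent with that figure also
covering the daily incrementals.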
You are talking about differential archiving at the binary level. DAR
can employ delta binaries for both incremental and decremental backups,
which means only the changes to the binary data, if any, are stored in
subsequent backups, not the entire file. There is no need for multiple
full backups. A full backup of my main array to hard drive media will
take a minimum of 15 days. Transferring across the 10G link to my
backup server would take a minimum of 2.8 days. Although full backups
of smaller systems will normally take less time, the fact is full
backups on a regular basis are just not necessary, and especially not
when using DAR.
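
To illustrate the general idea of storing only the changed binary data
rather than the whole file, here is a minimal Python sketch. It is not
DAR's algorithm or archive format. It simply compares fixed-position
blocks by hash, and the block size, function names and delta layout are
all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for this illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of `data`."""
    return [
        hashlib.sha256(data[start:start + block_size]).digest()
        for start in range(0, len(data), block_size)
    ]


def make_delta(old, new, block_size=BLOCK_SIZE):
    """Record only the blocks of `new` that differ from `old`."""
    old_hashes = block_hashes(old, block_size)
    changed = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        is_new_block = index >= len(old_hashes)
        if is_new_block or hashlib.sha256(block).digest() != old_hashes[index]:
            changed[index] = block
    return {"length": len(new), "blocks": changed}


def apply_delta(old, delta, block_size=BLOCK_SIZE):
    """Rebuild the new file contents from `old` plus the stored delta."""
    out = bytearray()
    for index, start in enumerate(range(0, delta["length"], block_size)):
        if index in delta["blocks"]:
            out += delta["blocks"][index]
        else:
            out += old[start:start + block_size]
    return bytes(out[:delta["length"]])


# A 10,000-byte file in which a single byte changes.
old_data = bytes(10_000)
new_data = bytearray(old_data)
new_data[5000] = 1
new_data = bytes(new_data)

delta = make_delta(old_data, new_data)
stored = sum(len(block) for block in delta["blocks"].values())
print(f"blocks stored: {sorted(delta['blocks'])}, bytes stored: {stored}")
assert apply_delta(old_data, delta) == new_data
```

Only the block containing the modified byte ends up in the delta. Real
tools generally use rolling checksums so that inserted or shifted data can
still be matched, which this fixed-position comparison cannot do.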
volume size is 46.9 Terabytes (43.6 Tebibytes) after formatting. The
main server currently has 22 Terabytes of data on it. The backup server
is effectively full.
So you have a perfect candidate for deduplication :) because I guess you can
keep only a few copies of that size on the backup server.
A few copies of what size? The backup server is an exact mirror of the
main server, plus several T of additional files I don't need on the main
server. My DAR backups include new files (typically a few hundred GB)
plus changes to files (typically a few MB). Borg isn't really offering
anything DAR does not, except DAR is not server based. If I wanted to
run DAR on my backup server, I could, but I don't want to. I use DAR
for offline archival storage. The backup array is a mirror, not an
archive.
In the case of a catastrophic failure of the main server or its array,
I simply bring up the backup server as a replacement until such time -
days later - as the main server is rebuilt. In the case of an
accidentally lost or corrupted file(s), I just do a cp from the backup
to the main server, restoring the files to the state they were in as of
04:00 this morning. Typically, this takes a few seconds.