>  If you are looking for a replacement, I don't know of any that do rdiffs
>  besides rdiff-backup. I think that a good incremental backup would be your
>  best option.

All incrementals (that I know of) waste space when only a small part
of a large file changes, because the whole file is stored again. This
is a problem for me: many of our users have 1-2 GB Outlook files, and
I don't want to use up an extra 1 GB or more of storage for each of
those users every time I make a backup.
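
To put rough numbers on that, here is a small back-of-the-envelope
calculation. The daily change size and the retention count are made-up
figures for illustration, not measurements from any real mailbox.

```shell
# Assumed figures for illustration only (not measurements).
file_mb=1024      # one 1 GB Outlook file
delta_mb=20       # hypothetical size of one day's changes to that file
backups=30        # hypothetical number of backups retained

# File-level incremental: the whole file is stored again each time it changes.
incremental_mb=$((file_mb * backups))

# Delta-based history: one full copy plus a small delta per later backup.
delta_total_mb=$((file_mb + delta_mb * (backups - 1)))

echo "file-level incrementals: ${incremental_mb} MB"
echo "full copy plus deltas:   ${delta_total_mb} MB"
```

The gap grows with both the file size and the number of backups kept,
which is why reverse diffs matter for files like these.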

>
>  It looks like all the stuff with making the hardlinks and temp directory are
>  to avoid a potential conflict between the existing "rdiff-backup-data"
>  directory on backup1 and the other "rdiff-backup-data" directory that gets
>  written to on backup2. If backup1 and backup2 both have rdiff-backup
>  installed then you can do something like
>

The hardlinks and temp directories have nothing (specific) to do with
the interaction between backup1 and backup2. Their only purpose is to
allow me to combine rsync and rdiff-backup.

i.e.:

1) rsync from remote location to a local path (without breaking
rdiff-backup store)

2) tell rdiff-backup to pull from this local path into its store, so
I get rdiff-backup history.
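
As a rough sketch only, the two steps could be wrapped like this. The
function names (run, backup_host) and all three paths are placeholders
invented for this example, and the hard-link and exclude handling is
left out to keep it short. Setting DRY_RUN prints the commands instead
of running them.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper; all three arguments are placeholder paths.
backup_host() {
    remote=$1    # e.g. host:/data/
    staging=$2   # local path that rsync writes into
    store=$3     # rdiff-backup destination (holds rdiff-backup-data)

    # Step 1: bring the local copy up to date with the remote source.
    run rsync -a --delete "$remote" "$staging/" || return 1

    # Step 2: record the new state in the rdiff-backup store.
    run rdiff-backup "$staging" "$store"
}
```

For example, "DRY_RUN=1; backup_host host:/data/ /tmp/stage /tmp/store"
shows the rsync and rdiff-backup invocations without touching anything.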

The reason for using hard links is to conserve space - if 'files' is
100GB, I don't want the 'temp' directory to also be 100GB. Also, it's
much faster to make hardlinks than to physically copy all the bytes
over from 'files' to 'temp'.
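
A minimal sketch of building such a hard-linked mirror with plain find
and ln follows. The function name link_tree is made up for this
example, and it assumes both trees are on the same filesystem and that
no file name contains a newline. On GNU systems "cp -al files temp"
should give much the same result.

```shell
# Hypothetical helper: mirror SRC into DST using hard links, not copies.
# Directories are recreated; regular files are linked, so no file data
# is duplicated on disk.
link_tree() {
    src=$1
    dst=$2

    # Recreate the directory structure first.
    ( cd "$src" && find . -type d ) | while IFS= read -r d; do
        mkdir -p "$dst/$d"
    done

    # Then hard-link every regular file into the mirror.
    ( cd "$src" && find . -type f ) | while IFS= read -r f; do
        ln "$src/$f" "$dst/$f"
    done
}
```

After "link_tree files temp", each file in 'temp' shares its inode with
the matching file in 'files', so the mirror costs only directory
entries rather than another full copy of the data.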

I exclude rdiff-backup metadata when I sync from 'files' to 'new' for
a few reasons:

1) The source (where I am rsyncing from) won't (shouldn't) have it, so
it will get erased anyway when I run the rsync.

2) In the event that the source does have an rdiff-backup-data
directory for some reason, I need to exclude it anyway, because it
will cause problems with rdiff-backup.

I use the same method on backup2 (to back up backup1) because I have
the same needs as on backup1 (to back up all the other servers,
workstations, etc.):

- I want history of the source
- I want to conserve space

The main difference is that backup2 contains the sum of all files on
backup1, plus their rdiff-backup metadata (in sub-dirs, not under the
root of backup1 where it would cause problems for backup2). In theory
there shouldn't be any major problem with this approach.

rdiff-backup's docs say that a large number of files shouldn't cause
excessive memory usage, but that hardlinks can, if you're hard
linking thousands of files together.

This fact and our discussion give me an idea: I think that
rdiff-backup *might* have a bug, where if the source and dest files
are all hardlinked to each other, it will use extra memory.
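
One quick way to gauge how many files are involved is to count the
regular files under a tree that have more than one link. The function
name count_hardlinked is invented for this sketch, and it assumes no
file name contains a newline. If the count is large, rdiff-backup's
--no-hard-links option may also be worth a test, since its man page
describes it as reducing memory usage when many hard-linked files are
present.

```shell
# Hypothetical helper: count regular files under DIR whose link count
# is greater than one, i.e. files that are hard-linked somewhere else.
count_hardlinked() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}
```

Running "count_hardlinked /backup/files" before and after a change to
the backup layout gives a simple number to compare against memory use.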

>  rdiff-backup backup1::/backup/files /backup/path/on/backup2 --exclude
>  **rdiff-backup-data**
>
>  on backup2. This avoids making hardlinks and a temp directory and also avoids
>  your problem of having the two "rdiff-backup-data" directories conflicting.
>

This is worth testing. If my non-standard usage (all those hardlinks
between source and dest files) is causing excessive memory usage on
backup2 (due to a huge number of files), this should fix the problem.
backup1 will also have this problem, but to a lesser degree (it has
90-odd backups instead of one huge single backup).

I'll look into it.

(It will be a bit of a hack, because currently my backup logic is
split into two separate steps for each backup, regardless of type
(files, db, etc.): first get the data from the source, then
compress/make history/etc. I'll have to add a new 'source' type for
'rdiff-backup', where there is no applicable 'compress' logic.)

Thanks for your feedback :-)

David.

