OK, I've written a quick & dirty utility to do this (collect an
increment's files into a single tarball) after the fact. I've posted it
in the wiki at the bottom of ContribScripts,
but as I noted there it's probably very dicey to use and it needs real
testing. Anyone care to give it a go? ;)
It's my hope that this can one day be an alternative to
"--remove-older-than" --- perhaps "--move-older-than".
An interesting thing about the output tarballs from my script: if I
rdiff two of them, the first tarball plus the delta file is
significantly smaller than the two tarballs together (presumably
because increments from different days are nonetheless similar).* This
is probably very dependent on what kind of data is being backed up, but
it may point to a way to make increment storage even more efficient
(though also more fragile, since a restore would take two levels of
merging). It's also quite possible that this is a clear sign that I've
done something very wrong in my script and am duplicating data across
what are supposed to be separate increments. Further testing is
required ;)
*Example from my test set: a collected increment from 2008-10-04 is
49MB, and the one from 2008-10-05 is also 49MB (total 98MB). An rdiff
delta file to turn 2008-10-04 into 2008-10-05 is only 18MB, so
2008-10-04 plus the delta file is 67MB. Another delta to turn 2008-10-05
into 2008-10-06 is also only 18MB, so the three of them together are
85MB instead of 147MB. Again, this is probably highly dependent on the
kind of data that's in these increments, but I'm surprised it works as
well as it does given that I'm tarring some already-gzipped files together.
~Felix.
On 07/03/09 15:03, Marcel (Felix) Giannelia wrote:
Is there any way of making rdiff-backup produce single files as
increments (say, by zipping them together when it produces them),
instead of thousands of itty bitty files? One file per increment would
make the task of moving old ones onto archive DVDs a lot easier, and
would be much easier on the target machine's filesystem, too. It
probably wouldn't slow down restores all that much, as
accessing an archive file's directory structure is likely faster than
doing the same in a part of the filesystem containing many thousands
of files per directory.
Presently, I'm trying to do a du -s on our backup directories, and it
has sat there for over an hour without printing the size of the first
one. According to top, du is using 50% of the total memory. I
know that there are statistics files which I could add together, but
in this case I want to use du to be sure because there's a chance that
there might be stray files in our rdiff-backup-data directory. Also,
creating so many files that commands like du cannot even function is,
in my opinion, incorrect behaviour.
~Felix.
_______________________________________________
rdiff-backup-users mailing list at [email protected]
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL:
http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki