I think there are two questions here: (1) what are good general backup practices, and (2) how to back up svn repos specifically.
"If we lose it, we're all out of work." Hopefully your boss recognizes this and has budgeted appropriately. In my experience there is no perfect backup; the best you can do is keep reducing the odds of a catastrophic failure.

Step one would be to run your svn server with some kind of redundant disk configuration. Of course we all know RAID is not backup, but storage is relatively cheap, so why not? I'd then back up to at least two different machines, preferably offsite. Cloud storage is fairly cheap these days as well. So a good scheme might be (at least) one offsite server that you control, and (at least) a second copy with a cloud provider (CrashPlan, BackBlaze, Amazon, DropBox, etc).

What we've always done is a simple rsync of the repo tree. Your email made me realize that we could be running the backup right when someone is committing, and thus ending up with a corrupt repo tree. However, we have some mitigating factors: we don't have just one repo, but literally dozens. We also do backups twice per week, and we keep several months of backups. So my collection of backups probably does have some corrupt repo trees... but given the number of repos we have, plus the fact that the backup jobs run in the middle of the night or on the weekend, I think the probability is pretty low that I have any significant corruption.

As you suggested, if you can make a fancier backup script that blocks anyone from making changes to the repo while the backup is taking place, that's even better. For my personal svn repos (home hobby projects) I do simple backups with svnadmin dump.

Lastly, you probably owe it to your company to regularly test your backups to ensure that they are indeed viable. Just as buildings have fire drills, sysadmins should have DR drills.

Hope these suggestions are useful!
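To make the "back up, then test the backup" idea a bit more concrete, here is a rough Python sketch of a per-repo hotcopy followed by a verify of the copy. Unlike a plain rsync, svnadmin hotcopy is documented as safe to run while the repository is in use. Everything named here is an assumption for illustration only: the function names, the repo_parent / dest_root / stamp parameters, the directory layout, and the heuristic for spotting a repo (a "format" file plus a "db" directory). It also assumes svnadmin is on the PATH.

```python
import subprocess
from pathlib import Path


def is_repo(path):
    """Heuristic (an assumption): a repo dir has a 'format' file and a 'db' dir."""
    return (path / "format").is_file() and (path / "db").is_dir()


def plan_backup(repo_parent, dest_root, stamp):
    """Build the svnadmin command lines for every repo found under repo_parent.

    Returns a list of argv lists; nothing is executed here, so the plan can be
    printed and reviewed before anything touches the disk.
    """
    commands = []
    for repo in sorted(Path(repo_parent).iterdir()):
        if not is_repo(repo):
            continue
        target = Path(dest_root) / stamp / repo.name
        # Copy the repository, then check the integrity of the copy itself.
        commands.append(["svnadmin", "hotcopy", str(repo), str(target)])
        commands.append(["svnadmin", "verify", "--quiet", str(target)])
    return commands


def run_backup(repo_parent, dest_root, stamp):
    """Run the plan; check=True raises on the first command that fails."""
    (Path(dest_root) / stamp).mkdir(parents=True, exist_ok=True)
    for argv in plan_backup(repo_parent, dest_root, stamp):
        subprocess.run(argv, check=True)
```

Something like run_backup("/srv/svn", "/backups/svn", "2016-06-01") from a nightly cron job would then fail loudly if any copy does not verify (the paths are made up). A clean verify only says the copy is internally consistent, so an occasional test restore and checkout is still worth doing.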
On Wed, Jun 1, 2016 at 9:58 AM, Michael Schwager <mschw...@gmail.com> wrote:
> Hello,
> We are very paranoid about our Subversion repo, notwithstanding the fact
> that the previous sysadmin didn't back it up. But that's another story. Now
> I'm here at my job, I've inherited the repo admin duties, and I want to back
> it up reliably. If we lose it, we're all out of work.
>
> My question is: How do I back it up reliably, and verify it so that I can
> deliver a 100% recovery guarantee to my boss? I have Subversion 1.8.4 on a
> CentOS 6.3 server, and Tortoise SVN 1.8.11 on Windows 7 clients.
>
> I am thinking to do both an svn hotcopy to one directory, and an rsync to
> another. The svn hotcopy will give me a backup that I'm pretty sure is
> reliable (see Notes below). Assuming httpd is down and I can guarantee that
> I am the only person who will be logged into the SVN server, can I expect
> with 99.9% surety that the svn repos are quiescent?
>
> Thanks.
> --
> -Mike Schwager
>
> Notes:
>
> We're a little worried about svn hotcopy; we ran into a bug that came about
> under 1.8 when working with older repos; the hotcopy exits with the
> following error:
>
> svnadmin: E200002: Serialized hash missing terminator
>
> I have compiled subversion-1.9.4 on the server under /opt/subversion-1.9.4.
> If I run that version of svn hotcopy, it appears to work and svnverify exits
> successfully. But if I look at all the files under both the original and the
> hotcopy on one of our repos, I find that a file is missing:
> repos2/db/rev-prop-atomics.shm . That's probably ok, but still- how do we
> know the latest hotcopy, and hotcopies of the future, are and will remain
> 100% bug-free?