On 6/1/2016 4:58 PM, Michael Schwager wrote:
Hello,
We are very paranoid about our Subversion repo, notwithstanding the fact that the previous sysadmin didn't back it up. But that's another story. Now I'm here at my job, I've inherited the repo admin duties, and I want to back it up reliably. If we lose it, we're all out of work.

My question is: How do I back it up reliably, and verify it so that I can deliver a 100% recovery guarantee to my boss? I have Subversion 1.8.4 on a CentOS 6.3 server, and Tortoise SVN 1.8.11 on Windows 7 clients.

I am thinking of doing both an svnadmin hotcopy to one directory and an rsync to another. The hotcopy will give me a backup that I'm pretty sure is reliable (see Notes below). Assuming httpd is down and I can guarantee that I am the only person logged into the SVN server, can I expect with 99.9% certainty that the svn repos are quiescent?

Thanks.
--
-Mike Schwager

Notes:

We're a little worried about svnadmin hotcopy; under 1.8 we ran into a bug when working with older repos, where the hotcopy exits with the following error:

svnadmin: E200002: Serialized hash missing terminator
As far as I can tell, this indicates a problem in the repository you are trying to hotcopy from. Run svnadmin verify on it to find out where the corruption is located, and resolve it if possible.

I have compiled subversion-1.9.4 on the server under /opt/subversion-1.9.4. If I run that version of svnadmin hotcopy, it appears to work and svnadmin verify exits successfully. But if I compare all the files under the original and the hotcopy of one of our repos, I find that a file is missing: repos2/db/rev-prop-atomics.shm. That's probably ok, but still, how do we know that the latest hotcopy, and future hotcopies, are and will remain 100% bug-free?
To get the integrity of a backup as close to 100% as possible, I go the fail-safe way:
1. svnadmin verify to ensure the current repository is in a good state (if there are errors/issues resolve them)
2. svnadmin dump to dump the current repository
3. svnadmin load the dump into a fresh repository
4. svnadmin dump the newly loaded repository
5. compare the first and the last dump
6. run svnadmin verify on the newly loaded repository
If both dumps are equal, I'm confident enough that integrity is preserved. (Note: this requires the same fsfs format as well as the same server version for all svnadmin calls.)
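
The round trip above could be scripted roughly as follows. This is only a sketch in Python, assuming svnadmin is on the PATH and the repository is offline while it runs. The helper names (sha256_of, dump, roundtrip_check) and the file names inside workdir are made up for illustration.

```python
import hashlib
import subprocess
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dump(repo, dump_file):
    """Write a full dump of the repository to dump_file (steps 2 and 4)."""
    with open(dump_file, "wb") as out:
        subprocess.run(["svnadmin", "dump", "--quiet", str(repo)],
                       stdout=out, check=True)


def roundtrip_check(repo, workdir):
    """Return True if dump -> load -> dump yields two identical dump files."""
    workdir = Path(workdir)
    first = workdir / "first.dump"      # hypothetical file names
    second = workdir / "second.dump"
    fresh = workdir / "fresh-repo"

    # Step 1: the source repository must verify cleanly.
    subprocess.run(["svnadmin", "verify", "--quiet", str(repo)], check=True)
    # Step 2: dump the source repository.
    dump(repo, first)
    # Step 3: load the dump into a freshly created repository.
    subprocess.run(["svnadmin", "create", str(fresh)], check=True)
    with open(first, "rb") as src:
        subprocess.run(["svnadmin", "load", "--quiet", str(fresh)],
                       stdin=src, check=True)
    # Step 4: dump the newly loaded repository.
    dump(fresh, second)
    # Step 6: the loaded repository must verify cleanly too.
    subprocess.run(["svnadmin", "verify", "--quiet", str(fresh)], check=True)
    # Step 5: compare the first and the last dump.
    return sha256_of(first) == sha256_of(second)
```

Any failing svnadmin call raises CalledProcessError because of check=True, so a True return value means every step succeeded and the two dumps hash to the same value.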

This process, however, only works when you can take down access to the repository completely. Otherwise svnadmin hotcopy would be my choice too. In addition, setting up a mirror which is kept in sync using svnsync is also a reasonable measure to further increase the reliability of an SVN repository IMO.
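
For the hotcopy route, one option is to verify the copy right after taking it, so a bad backup is noticed immediately. A minimal sketch, assuming svnadmin is on the PATH; the function names and the injectable run parameter are my own invention, not anything Subversion ships:

```python
import subprocess


def hotcopy_commands(repo, backup):
    """Build the svnadmin calls: copy the live repository, then verify the copy."""
    return [
        ["svnadmin", "hotcopy", str(repo), str(backup)],
        ["svnadmin", "verify", "--quiet", str(backup)],
    ]


def hotcopy_and_verify(repo, backup, run=subprocess.run):
    """Run the commands in order; check=True makes the first failure raise."""
    for cmd in hotcopy_commands(repo, backup):
        run(cmd, check=True)
```

Passing a different run callable makes it possible to dry-run the sequence (for example, just printing the commands) before pointing it at a real repository.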

However, it's also utterly vital to keep the server up to date with patch releases, so that you are not exposed to flaws/bugs on the server side. Some examples of issues which were fixed a long time ago but are not fixed in 1.8.4 (and might be relevant for ensuring data correctness/integrity and a reliable backup system):
1.8.5: hotcopy: fix hotcopy losing revprop files in packed repos (issue #4448)
1.8.9: svnadmin dump: don't let invalid mergeinfo stop dump
1.8.9: svnrdump load: fix crash when svn:* normalization (issue #4490)
1.8.9: mod_dav_svn: detect out of dateness correctly during commit (issue #4480)
1.8.11: disable revprop caching feature due to cache invalidation problems
1.8.13: svnadmin load: tolerate invalid mergeinfo at r0 (issue #4476)
1.8.13: svnadmin load: strip references to r1 from mergeinfo (issue #4538)
1.8.13: svnsync: strip any r0 references from mergeinfo (issue #4476)
1.8.14: prevent possible repository corruption on power/disk failures
1.8.16: dump: don't write broken dump files in some ambiguously encoded fsfs repositories (issue #4554)
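
A backup job could also warn when the server tools lag behind a chosen patch release. A small sketch, assuming svnadmin is on the PATH and that `svnadmin --version --quiet` prints just the version number; the function names and the default minimum of 1.8.16 are illustrative choices only:

```python
import re
import subprocess


def parse_version(text):
    """Turn a string such as '1.8.4' into a comparable tuple (1, 8, 4)."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_outdated(installed, minimum="1.8.16"):
    """True if the installed version is older than the minimum patch release."""
    return parse_version(installed) < parse_version(minimum)


def installed_svnadmin_version():
    """Ask svnadmin for its bare version number."""
    result = subprocess.run(["svnadmin", "--version", "--quiet"],
                            stdout=subprocess.PIPE, check=True,
                            universal_newlines=True)
    return result.stdout.strip()
```

Comparing tuples of integers rather than strings matters here: as plain text, "1.8.4" would sort after "1.8.16".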

If you feel very conservative, you might also want to consider staying on the old stable version for the server (sidenote: even with the server running 1.8, clients could use svn 1.9), if you don't need the bugfixes/improvements of the current stable svn version. While the current stable build (1.9.4) contains fixes for issues still present in 1.8.16 and also delivers new features, it also contains features/improvements which, by the nature of software lifecycles, have not been tested in the wild for as long as the features present in 1.8.16. So one might argue that from the code integrity point of view, 1.8.16 would be the safer choice.

If you are compiling your own server, be sure to keep its dependencies up to date as well.

--
Regards,
Stefan Hett
