Hi,
For backup purposes I keep a mirror of my svn repo. The mirror is
modified only by "svnsync", which runs hourly in a cron job.
In order to validate the mirror, I run an "svnadmin dump" on the mirror
and on the original, and assert that their md5sums are the same.
I am finding that in a few of the revisions in my history, in which a set
of files is deleted, the svndumps of the original & mirrored repos list
the files in different orders, which of course makes the md5sums differ
even though the repos appear to be in the same state.
Both the original & mirror are fsfs-format, and I'm using subversion-1.7.1
on both sides. (I just checked that subversion-1.7.5 does the same
thing.) The apr and other libs are likely different versions, though.
I tried to create a small test repo that demonstrates this behavior, but I
haven't been able to trigger it. Argh. I've been running this backup
approach for a long time and have never seen this before, but it does show
up in a few revs in my repo. (The repo is available at
http://astrometry.net/svn, and rev 20053 shows this behavior, FWIW.)
I added some debugging output to svn_delta_path_driver() in
subversion/libsvn_delta/path_driver.c, and it does visit the deleted files
in the same order. My guess is that, since the deleted files get added to a
hash (pb->deleted_entries, in delete_entry() in
subversion/libsvn_repos/dump.c) and the hash gets iterated over later in
close_directory(), the iteration order of the hash keys may not be defined,
so the order in which they actually get written out can vary. But I've
spent a total of maybe 15 minutes working with the subversion/apr code, so
your guess is better than mine.
Suggestions on how to proceed would be appreciated. My first guess would
be to sort the deleted entries in close_directory() before writing them
out, or to use a list-like rather than hash-like data structure to store
the deleted entries.
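
To illustrate the first suggestion, here is a minimal sketch in Python (not the C of dump.c, and not a patch). The function name format_deletes and the shortened paths are made up for the example. It only shows that if the deleted paths are sorted before being written out, the output no longer depends on the order in which the container hands them back.

```python
def format_deletes(deleted_entries):
    """Return dump-style delete records for the given paths.

    The paths are sorted first, so the result is the same regardless of
    the iteration order of the container they were collected in.
    """
    lines = []
    for path in sorted(deleted_entries):
        lines.append("Node-path: %s" % path)
        lines.append("Node-action: delete")
        lines.append("")
    return "\n".join(lines)


# The same three (hypothetical, shortened) paths, collected in two
# different orders and in two different container types.
from_set = format_deletes({"paper_plots/36ggSLACS.ps",
                           "paper_plots/1qq.ps",
                           "paper_plots/66gg.ps"})
from_list = format_deletes(["paper_plots/66gg.ps",
                            "paper_plots/1qq.ps",
                            "paper_plots/36ggSLACS.ps"])
```

Both calls produce the same text. The cost is that the output order becomes alphabetical rather than the order in which the editor visited the paths.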
cheers,
dustin
Background: the dump is something like:

    ${SVNADMIN} dump -q --incremental -r 20000:HEAD ${MIRROR} | \
        grep -v Text-copy-source-md5 | \
        md5sum -

And I do the same on the remote side via ssh. The 20000 is there to make
it run faster; I keep archives of the svndumps up to 20k.
(This does mean that if there is a change to the original repo between the
svnsync and the svndump, the md5sums will come out different. This is a
low-traffic repo so I actually like the occasional false alarm: if your
smoke alarm goes off when you burn toast, at least you know it still
works.)
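
As a possible workaround on the checking side, here is a rough Python sketch of a checksum that ignores the order of adjacent delete records. It is an assumption-laden illustration, not a real dump parser: it treats the dump as a list of lines (bytes), does not honor Content-length, and so could be confused by file contents that happen to look like dump headers. The name normalized_md5 and the sample paths are made up for the example.

```python
import hashlib

DELETE = b"Node-action: delete"


def normalized_md5(lines):
    """md5 hex digest of dump lines, with each run of adjacent delete
    records (Node-path followed by Node-action: delete) sorted by path.

    Blank separator lines inside or right after such a run are skipped,
    identically on both sides of the comparison.
    """
    digest = hashlib.md5()
    pending = []  # Node-path lines of the current run of deletes

    def flush():
        for path_line in sorted(pending):
            digest.update(path_line + b"\n" + DELETE + b"\n")
        pending.clear()

    i = 0
    while i < len(lines):
        line = lines[i]
        following = lines[i + 1] if i + 1 < len(lines) else b""
        if line.startswith(b"Node-path: ") and following == DELETE:
            pending.append(line)
            i += 2
        elif line == b"" and pending:
            i += 1
        else:
            # Any other line ends the run of deletes.
            flush()
            digest.update(line + b"\n")
            i += 1
    flush()
    return digest.hexdigest()


# Two hypothetical excerpts that differ only in the order of the deletes.
original = [b"Revision-number: 20053",
            b"Node-path: plots/36ggSLACS.ps", DELETE,
            b"Node-path: plots/1qq.ps", DELETE]
mirror = [b"Revision-number: 20053",
          b"Node-path: plots/1qq.ps", DELETE,
          b"Node-path: plots/36ggSLACS.ps", DELETE]
other = [b"Revision-number: 20053",
         b"Node-path: plots/1qq.ps", DELETE]
```

Here normalized_md5(original) and normalized_md5(mirror) agree, while normalized_md5(other) differs. In a pipeline like the one above, the lines could come from splitting sys.stdin.buffer.read() on b"\n" in place of the final md5sum step.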
SVN-fs-dump-format-version: 2
UUID: 03a3cea6-2c03-0410-ac9b-9271b1c66c29
Revision-number: 20053
Prop-content-length: 139
Content-length: 139
K 10
svn:author
V 8
vivitsal
K 8
svn:date
V 27
2011-12-21T21:17:15.067006Z
K 7
svn:log
V 36
archetypes: version submitted to ApJ
PROPS-END
...
Node-path: trunk/documents/papers/archetypes/paper_plots/36ggSLACS.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/66ggSLACS.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/1qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/2qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/3qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/4qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/36gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/12ggg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/66gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/18lg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/55lg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/1gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/2gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/3gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/103lg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/59lgg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/68lgg.ps
Node-action: delete
SVN-fs-dump-format-version: 2
UUID: 03a3cea6-2c03-0410-ac9b-9271b1c66c29
Revision-number: 20053
Prop-content-length: 139
Content-length: 139
K 10
svn:author
V 8
vivitsal
K 8
svn:date
V 27
2011-12-21T21:17:15.067006Z
K 7
svn:log
V 36
archetypes: version submitted to ApJ
PROPS-END
...
Node-path: trunk/documents/papers/archetypes/paper_plots/1qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/2qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/3qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/4qq.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/36gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/12ggg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/66gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/18lg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/55lg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/1gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/2gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/3gg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/103lg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/59lgg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/68lgg.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/36ggSLACS.ps
Node-action: delete
Node-path: trunk/documents/papers/archetypes/paper_plots/66ggSLACS.ps
Node-action: delete