Hi, I just wanted to say that I finally managed to get the dump-filter-load cycle done and deploy the filtered repo. It got rid of about 90% of the repository size, so that's good for us. A big thanks to all who helped out with information in this mail thread! I would definitely have been stranded somewhere without you.
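For anyone running a similar cycle: the copy operations recorded in a dump file can be listed up front, so that copy destinations whose source lies under an excluded path can go into the filter list before svndumpfilter stops on them. A minimal sketch, not a tested tool. It assumes the dump has been written to a file and relies on the `Revision-number`, `Node-path` and `Node-copyfrom-path` headers of the dump format. The function name `list_copies` is made up for illustration. It matches header lines anywhere in the file, so file contents that happen to look like these headers would show up as false hits.

```shell
#!/bin/sh
# list_copies PREFIX DUMPFILE   (hypothetical helper, not part of Subversion)
#
# Print one line per copy whose source is PREFIX or lies under PREFIX:
#   r<revision>: <copy source> -> <copy destination>
#
# Example call (file names are placeholders):
#   list_copies trunk/old full.dump
list_copies() {
    prefix=$1
    dumpfile=$2
    LC_ALL=C awk -v prefix="$prefix" '
        # Remember the revision we are currently inside.
        /^Revision-number: / { rev = $2 }

        # Node-path comes first in a node record; keep it as the destination.
        /^Node-path: / { dest = substr($0, 12) }

        # A copy records its source in Node-copyfrom-path.
        /^Node-copyfrom-path: / {
            src = substr($0, 21)
            sub(/^\//, "", src)   # tolerate a leading slash, just in case
            if (src == prefix || index(src, prefix "/") == 1)
                printf "r%s: %s -> %s\n", rev, src, dest
        }
    ' "$dumpfile"
}
```

Each destination it prints is a candidate for the file passed to `svndumpfilter exclude --targets`.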
One thing that was a bit annoying: svndumpfilter threw an error because the source of a copied file was missing. I had filtered out a certain path, and it turned out that a file in it had been copied to another location. The error message only prints the missing source and not the destination, so I had to go into the repo and check the revision it crashed on to find the copy destination and add it to my filter list. It would have been nice if the error message listed both the source and the destination.

/Chris

--------------------------------------------
On Wed, 10/10/18, Johan Corveleyn <jcor...@gmail.com> wrote:

 Subject: Re: svndumpfilter and svnsync?
 To: "Chris" <devnullacco...@yahoo.se>
 Cc: "Daniel Shahaf" <d...@daniel.shahaf.name>, "Ryan Schmidt" <subversion-2...@ryandesign.com>, "Subversion" <users@subversion.apache.org>
 Date: Wednesday, October 10, 2018, 12:11 PM

On Wed, Oct 10, 2018 at 11:18 AM Chris <devnullacco...@yahoo.se> wrote:
...
> >>> The syntax I used:
> >>> svnadmin dump -q MYREPO | svndumpfilter exclude --targets filterfile > filterdump
> >>> svnadmin load -q --no-flush-to-disk --force-uuid -M 2048 --bypass-prop-validation ./NEWREPO < filterdump
> >>>
> >>> (I had to use the bypass-prop-validation due to some newline issues
> >>> in old log message, similar to this one
> >>> https://groups.google.com/forum/#!topic/subversion_users/P3ohZ-hKhCA,
> >>> don't know why they have wrong newlines, but the repo works as it is
> >>> now...)
> >>
> >> Instead of ignoring wrong newlines, you could fix them using
> >> svndumptool (using its eolfix-revprop command), originally at:
> >>
> >> http://svn.borg.ch/svndumptool/
> >>
> >> Newer fork at:
> >>
> >> https://github.com/jwiegley/svndumptool
> >
> > Also, as of version 1.10, svnadmin finally has an option to normalize
> > these on-the-fly during 'load':
> > http://subversion.apache.org/docs/release-notes/1.10.html#normalize-props
> >
> > It's a lot better to normalize these (either with the
> > --normalize-props option for 'svnadmin load' or by using svndumptool)
> > than to "bypass" them. Otherwise you'll run into this again later (if
> > you would dump+load again sometime in the future).
>
> I tried --normalize-props and I still got the same error which is why I
> switched over to bypass. Maybe I've run into some bug with --normalize-props.
> Unfortunately, I don't think I'll be able to create a script for reproducing
> the error since it happens far into a monster dump load.
> So I'll stick with the bypass for now or try the tool that Ryan suggested.

In that case the culprit might be another property than svn:log (or it might be something like "non UTF-8 encoded" but not EOL-related in svn:log). Possibly a "versioned" property like svn:ignore or some other property in the svn: namespace. This is more difficult to fix, but still it might be best to get rid of it or you'll run into it again in the future. See the very last bullet in:

http://subversion.apache.org/faq.html#dumpload

If that's indeed the problem, then you'll have to use that svndumptool that Ryan pointed you to. Quoting from that last bullet in the FAQ entry above:

"This is more difficult to repair, because 'svn:ignore' is not a revision property (unlike svn:log, which can be manipulated with svnadmin setrevprop), but a versioned property (so it's part of history). Again, you can ignore this with --bypass-prop-validation.
But since this is a corruption "in history", this can only be repaired with a dump+load, so this might be a good time to try and fix this (or you'll run into this again in the future). To repair it you can use a tool like svndumptool. But it only works on dump files, not as part of a pipe. So a possible way to go about it is: dump that single (corrupt) revision to a file, repair it ('svndumptool.py eolfix-prop svn:ignore svn.dump svn.dump.repaired'), load that single dumpfile, and then continue with a new "piped" command (like step (6) above)."

I should note here that svnsync is more powerful in this regard: it does have the ability to normalize all of these on the fly. It's a real pity that 'svnadmin load' doesn't (except for the svn:log EOL fixing). Doesn't *yet* that is, until a volunteer comes along that submits a patch for it ;-).

Anyway, I hope you succeed in cleaning this up eventually :-).

--
Johan
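The single-revision repair described in the FAQ quote above could be sketched roughly as follows. This is only an illustration under assumptions, not the FAQ's own script: the function name `repair_plan` is made up, the use of `--incremental -r` to dump one revision and then the rest is my reading of "dump that single revision" and "continue with a new piped command", and the target repository is assumed to already contain everything before the corrupt revision. The function only prints the commands so they can be reviewed before anything touches a repository. Paths are printed unquoted, so it assumes repository paths without whitespace.

```shell
#!/bin/sh
# repair_plan OLDREPO NEWREPO REV   (hypothetical helper)
#
# Print, without running them, the commands for repairing one corrupt
# revision REV with svndumptool and then continuing the piped dump+load.
# Assumes NEWREPO already holds revisions up to REV-1.
repair_plan() {
    old=$1
    new=$2
    rev=$3
    next=$((rev + 1))
    cat <<EOF
svnadmin dump -q --incremental -r $rev $old > svn.dump
svndumptool.py eolfix-prop svn:ignore svn.dump svn.dump.repaired
svnadmin load -q $new < svn.dump.repaired
svnadmin dump -q --incremental -r ${next}:HEAD $old | svnadmin load -q $new
EOF
}
```

Calling `repair_plan ./MYREPO ./NEWREPO <rev>` with the revision the load stopped on prints four lines: dump the single revision, repair it, load it, then resume the pipe from the next revision. The property name `svn:ignore` is taken from the FAQ example and would need to match whichever property is actually broken.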