Jason Merrill <ja...@redhat.com>:
> First, thanks a lot for the offer of help; I'm happy to take you up on it
> rather than do it all myself.

One important and messy part is just winding up - assembling a
contributor map so we'll have proper DVCS IDs everywhere.

> On 08/24/2015 12:54 PM, Joseph Myers wrote:
> >FWIW, Jason's own trial conversion with reposurgeon got up to at least
> >45GB memory consumption on a 32GB repository.
>
> It ended up being about 65GB. Fortunately I regularly use a machine with
> 128GB, so that isn't a big deal. And the trial conversion took less than a
> day; I didn't get an exact time.

I'm waiting for an additional 32GB to get here so my poor
memory-limited Great Beast can cope (and when did you last hear *that*
said about a machine with 32GB of RAM?)

Note: Purchase will be funded by the first month's pledges from my
rather-more-successful-than-expected Patreon page,
https://www.patreon.com/esr

> I'd like to use the --legacy flag so that old references to SVN commits are
> easier to look up.

Your call, but ... I don't recommend it. It's very cluttery, and I've
found the demand for that kind of lookup tends to drop off after
conversion faster than people expect it will.

> With respect to Joseph's point about periodic deletion and re-creation of
> branches, it looks like reposurgeon dutifully models them as deletion and
> re-creation of the entire tree, which is understandable but not ideal. It
> also warns about these with, e.g.,
>
>   reposurgeon: mid-branch deleteall on refs/heads/master at <184996>.
>
> Looking over the instances of this warning, it seems that in most cases it
> was branch maintainers deciding to blow away the entire branch and start
> over because svn mergeinfo had gotten too confused. I think in all of these
> cases the right thing is to pretend that the delete/recreate never happened.

Perhaps, but there be dragons here. Without those deletealls you could
easily end up with incorrect head-revision content. Before you try
anything clever here, examine the final repo state to see whether it
looks like "the delete/recreate never happened" - it very well might.

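One way to examine the final state is to diff a checkout of the SVN head
against a checkout of the converted git head for the same branch. What
follows is only a rough sketch of that check, not anything reposurgeon
provides: the function names and the two checkout directories are
hypothetical, and it compares file contents only. Expect some noise from
things like SVN keyword expansion, empty directories (which git does not
track), and any ignore files the conversion may have generated.

```python
import hashlib
import os

# VCS metadata directories; these are not part of the tree content.
SKIP_DIRS = {".git", ".svn"}


def tree_digest(root):
    """Map each file path (relative to root) to a SHA-256 of its content."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            h = hashlib.sha256()
            if os.path.islink(full):
                # Hash the link target rather than following the link.
                h.update(os.readlink(full).encode())
            else:
                with open(full, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests


def compare_trees(svn_checkout, git_checkout):
    """Return (only in SVN, only in git, present in both but different)."""
    a = tree_digest(svn_checkout)
    b = tree_digest(git_checkout)
    only_svn = sorted(set(a) - set(b))
    only_git = sorted(set(b) - set(a))
    differ = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_svn, only_git, differ
```

If all three lists come back empty (or contain only the expected noise)
for a branch that triggered the deleteall warning, then the converted
head already matches what SVN has, whatever the intermediate history
looks like.
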
> Unfortunately, it looks like reposurgeon doesn't deal with gcc SVN's
> subdirectory branches any better than git-svn. It does give a diagnostic
> about them:
>
>   reposurgeon: branch links detected by file ops only: branches/suse/
>   branches/apple/ branches/st/ branches/gcj/ branches/csl/ branches/google/
>   branches/linaro/ branches/redhat/ branches/ARM/ tags/ix86/ branches/ubuntu/
>   branches/ix86/
>
> though this is an incomplete list. There are also branches/ibm,
> branches/dead, tags/apple, tags/redhat, tags/csl, and tags/ubuntu.
>
> Ideally the conversion tool would just recognize that these are
> subdirectories containing branches rather than branches themselves. Neither
> git-svn nor reposurgeon currently do that, they both just treat them as one
> big branch. This is easy enough to fix after the fact with git
> filter-branch:
>
> https://gcc.gnu.org/wiki/GitMirror#Subdirectory_branches
>
> but you might want to improve reposurgeon to handle this pattern directly.

Look closely at branchify_map. I think we may be able to use it to get
the effect you want.

Is 'jason' your preferred username everywhere? I'll set up write
access to the conversion-machinery repo for you if you like.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>