Werner LEMBERG <w...@gnu.org>:
> > For your interest, and so you and the other listmembers can see how
> > this was done, I'm enclosing a tarball containing copies of the
> > current authormap file (which is what you modified, with five
> > entries added and some address removals reverted), the reposurgeon
> > lift script, and the Makefile. [...]
>
> This is indeed amazingly simple. Thanks for your work!

Most of the hard work got done back in 2010 when I saw the possibility
implied by the git import stream format, and wrote reposurgeon to
exploit it. The result is, as you noticed, an almost ridiculously
powerful tool that makes a lot of very hairy operations look simple.

The appearance is deceptive; there is a lot going on behind the scenes.
I shall explain a bit more, because the conversion is not finished yet
and you will need to make some policy choices before we're done.

The actual CVS-to-git conversion work was done by cvs-fast-export, which
I also maintain. The lift script expresses the edits to do on the
history once gitified. Under slightly different conditions it might
have looked like this:

verbose 1
set canonicalize
read .
delete :18138 obliterate

With this invocation, reposurgeon would have looked at the current
directory, seen that it was a CVS repo, looked in its table of import
front ends, and called cvs-fast-export itself, parsing the git
fast-import stream that it emits. The generated command would have
looked like this:

find . -name '*,v' -print | cvs-fast-export -k --reposurgeon

But there was a default option I wanted to suppress (--reposurgeon), so
I ran the front end "by hand" (actually, through a Makefile production)
instead. That option generates voluminous data on CVS revision numbers
that we don't need in this case because your change comments have no
CVS commit references in them to be translated.

The script is a little longer now:

verbose 1
set canonicalize
read groff-raw.fi
delete :18138 obliterate

# Salvage some multiline comments into git-like form by removing whitespace.
# On most this couldn't be done because they mixed topics,
# so a summary line would have been misleading.
mailbox_in <<EOF
------------------------------------------------------------------------------
Event-Number: 1833

* html.cc (create_tmp_file, create_temp_name): Removed.

It has been replaced with calls to xtmpfile() and xtmptemplate().
------------------------------------------------------------------------------
Event-Number: 1916

[[Several hundred lines of text omitted]]

------------------------------------------------------------------------------
Event-Number: 18144

Fixes to TOC, BIBLIOGRAPHY, and ENDNOTES leading management and traps.
EOF

prefer git
write groff.fi

That mailbox_in section batch-modifies commit comments. To make it, I
did this:

$ reposurgeon "read groff-raw.fi" "mailbox_out =L" >MULTILINE

which dumped all the non-git-conformant multiline comments into
MULTILINE. I then edited that and, when I was done, pasted it into the
lift script.

Here is your first policy choice. I am thinking of writing a filter
operation that would take all the comments that look like this:

[start of comment]
[blank line]
* One sentence of random stuff
[blank line]
[end of comment]

and delete the leading "* ", leaving

[start of comment]
[blank line]
One sentence of random stuff
[blank line]
[end of comment]

The operation would look, in reposurgeon, like this:

transform 1..$ /^\n\* (.*)\n\n$/\n\1\n\n/

The reason for this is that I think it would be good if a leading '*'
in a gitk list of first lines were a visual warning that the following
comment is "old style" - that is, that it fails to obey git conventions.

The policy question is: are you OK with me editing the history that
much? Some people would not be. It's another level of intrusiveness up
from just tweaking whitespace.

Also, you should know what I plan to do with the tarballs. I have a
tool called git-weave, not yet published, which takes a sequence of
tarballs and weaves it into a revision history - one commit per
tarball. I can write a little metadata file to specify commit comments
and tags.

I'll apply git-weave, then use reposurgeon to graft the tip of the
woven prehistory repo to the root. Then I'll check in a commit
describing the conversion and including the recipe I used to do it.

And that will be it.
-- 
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
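
Illustration of the proposed comment transform: what follows is a minimal
sketch in plain Python, not reposurgeon itself, of the effect the transform
regexp quoted in the message would have on a single comment. The function
name strip_leading_star is hypothetical, and the sketch assumes each comment
is held as one string with the leading and trailing blank lines shown in the
layout in the message.

```python
import re

# Mirrors the regexp from the proposed "transform" command. Assumption: a
# comment is one string made of a blank line, a single "* " line, and a
# trailing blank line, as in the layout described in the message.
OLD_STYLE = re.compile(r"^\n\* (.*)\n\n$")


def strip_leading_star(comment: str) -> str:
    """Drop the leading "* " from a one-sentence old-style comment.

    Comments that do not match the layout are returned unchanged.
    """
    return OLD_STYLE.sub(r"\n\1\n\n", comment)
```

Because "." does not match a newline, a comment with several "* " entries
does not match and is left alone, so it would keep its leading '*' as the
"old style" marker.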