2009/4/11 Stéfan van der Walt <ste...@sun.ac.za>:
> From my POV, the current system is very unproductive and, while
> git-svn makes life a bit easier, it comes with its own set of
> headaches. Especially now that we are evaluating different
> work-flows, we need the right kind of vcs to back it up.

Please take the following with a big grain of salt, because I haven't used either git or hg beyond trivial cloning. But I do have by now pretty extensive experience with bzr (ipython and nipy), so I think I've developed a decent intuition on what's important/useful in the long run from DVCS workflows. I have also been reading a fair amount about git and hg, in particular about the core of their internal models. This, it seems to me, is actually far more important in the long run than their top-layer polish.

From that perspective, right now my intuition (and yes, it's only that for now) tells me that git has some really appealing features for a dvcs, which as far as I can see are not present in hg (unless I've grossly misunderstood it, hg is much closer to bzr in its internal model than to git).

To me the key point of fundamental value in git is its direct manipulation of the commit DAG and history: this is something that I think one only comes to appreciate after using a DVCS for *a reasonably long time* on a *reasonably complex project* with multiple developers, branches and merges. I stress this because I think these points really only become apparent under such conditions; at least I didn't really think of these things until I used bzr extensively for ipython.

Let me elaborate a bit. One of the main benefits of a DVCS is that it enables all developers to be aggressive locally, to experiment on crazy ideas and to use the VCS as their safety line in their experimentation. You are free to try crazy things, commit as often and in as fine-grained a way as you want, and if things go wrong, you can backtrack easily. But in general what happens is that things don't simply go wrong: you often end up making things work, it's just that the intermediate history can look totally crazy, with tons of intermediate commits that are really of no interest anymore to anyone.

With git, there is a way of saying "merge all this into a single commit (or a few)" so that it goes into the upstream project in chunks that make logical sense, rather than ones that merely reflect your tiptoeing during a tricky part of the development. In bzr (and as far as I can see, also in hg), this kind of history rewriting is near impossible, so the best you can do is make a merge commit and leave all that history in there, visible in the 'second level' log (indented in the text view). As a project gains many developers, having all this history merged back into the main project tree gets unwieldy.

From my (now reasonably extensive) experience with bzr, it really feels like a system that took the centralized VCS model and put 'a little svn server everywhere you need one'. That is, the repository/branch/history model retains the rigidity of a centralized VCS, it's just that you can have it anywhere, and it can track branching and merging intelligently. There's certainly a lot of value in that, I am not denying it in the least bit.

However, git seems to really make the key conceptual jump of saying: once you have a truly distributed development process, that rigid model just breaks down and should be abandoned. What you need to accept is that the core objects you should manipulate are the atomic change units needed to reconstruct the state of the project, and the connectivity between those units. If you have tools to manipulate said entities, you'll be able to really integrate the work that many people may be doing on the same objects in disconnected ways, back into a single coherent entity.

Sorry if this seems a bit in the air, but I've been thinking about this for the past couple of days, and I figured I'd share. I don't mean this to be a bashing of hg or bzr (which we'll continue using for ipython at least for a long while, since now is not the time for yet another workflow change for us).

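To make the "merge all this into a single commit" idea above a bit more concrete, here is a toy sketch of a commit DAG in Python. It is only an illustration under simplifying assumptions: the names `Commit`, `commit`, `first_parent_log` and `squash` are made up for this example and are not git's object format or API. It just shows that squashing amounts to creating one new node whose content matches the tip of the messy work and whose parent is the point where that work started.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical, simplified commit: a message, a full snapshot, and parent links."""
    message: str
    snapshot: tuple          # sorted (path, content) pairs: project state at this commit
    parents: tuple = ()      # zero or more parent Commit objects (edges of the DAG)


def commit(message, files, *parents):
    """Build a commit from a dict of files and any number of parents."""
    return Commit(message, tuple(sorted(files.items())), tuple(parents))


def first_parent_log(tip):
    """Walk back from `tip` along first parents, newest message first."""
    log = []
    node = tip
    while node is not None:
        log.append(node.message)
        node = node.parents[0] if node.parents else None
    return log


def squash(base, tip, message):
    """Return ONE new commit on top of `base` with the same content as `tip`.

    The intermediate commits between `base` and `tip` are simply not
    referenced by the new node, so they vanish from its history.
    """
    return Commit(message, tip.snapshot, (base,))


# A messy local experiment: three small commits on top of a release.
base = commit("release 1.0", {"core.py": "v1"})
wip1 = commit("try idea", {"core.py": "v1", "exp.py": "draft"}, base)
wip2 = commit("oops, fix typo", {"core.py": "v1", "exp.py": "draft2"}, wip1)
wip3 = commit("finally works", {"core.py": "v2", "exp.py": "final"}, wip2)

# What would go upstream: one logical change with identical final content.
clean = squash(base, wip3, "Add experimental feature")

print(first_parent_log(wip3))
print(first_parent_log(clean))
```

The squashed commit carries exactly the same project state as `wip3`, but its log contains only the logical change and the release it builds on. In git itself, `git merge --squash` and interactive rebase produce a similar result; the sketch above does not model how they do it.
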
But from *my* perspective, git really offers the correct abstractions to think about distributed collaborative workflows, while the other systems simply seem to offer tools to distribute the workflow of a rigid development history (a la CVS) to multiple developers. There's a fundamental difference between those two approaches, and I think it's a critically important one.

As for what numpy/scipy should do, I'll leave that to those who actually contribute to the projects :) I just hope that this view is useful to anyone who wants to think about the problem from an angle different to that of specific commands, benchmarks or features :)

All the best,

f