On May 18, 11:34 pm, Anssi Kääriäinen <anssi.kaariai...@thl.fi> wrote:
> > You are making a lot of assumptions about the branches someone might send
> > via pull request. It is entirely valid, and often _preferable_ that someone 
> > is
> > depending on these. Maybe they were running their site on this branch,
> > developing
> > the feature to support something they needed, or simply as a way to test it
> > in the wild. Maybe someone else based their own work on an existing topic
> > branch.
>
> > Also, we can't assume that merging back to upstream is the only reason 
> > someone
> > may have created this branch.
>
> I am talking about topic branches here. If you have a branch of "admin-
> next" for example then the workflow might need to be different.
>
> It seems the correct way to describe a Github topic branches is that
> they are just another way of publishing a _patch series_. In Linux
> this is done by sending the patches to LKML. We used to do this by
> Trac patches (still accepted) and now by pull requests in Github. The
> confusion comes from the note that "public branches should never be
> rebased". The topic branches should not be viewed as public branches,
> just as a way to publish patch series.
>
> I am not sure what to do with a pull request of a branch a site is
> depending upon. If the pull is not accepted into Django, how is the
> depended-upon branch supposed to continue and be acceptable for
> further pull requests into Django? If you merge and resolve conflicts
> (by a competing approach maybe), then you will have such a messy
> history it will not be pulled into Django. If you rebase, everybody
> depending on the branch will need to do the same. If you revert the
> rejected patch, then merge, you are again having a history which
> should not be merged into Django. Am I missing something important
> here?

I have done some more research. This post is somewhat long, so if you
want just the conclusions, it is enough to read the last three
paragraphs.

First, some useful links:
http://kerneltrap.org/Linux/Git_Management
http://sethrobertson.github.com/GitBestPractices/#workflow

>From the above, and elsewhere in the net, it is pretty easy to see
that getting the "sausage making" (see GitBestPractices link above)
steps into the core repository history is not wanted. Hiding sausage
making is just a way to say we want good logical commits, but we do
not want back-and-forth development commits into the history. Getting
the development commits into the history makes the history messier,
and it makes bisecting harder. Seeing a "final whitespace cleanup" as
the commit in git blame isn't good...

In Linux the workflow is as follows: topic branches are dealt in patch
format alone. This means that there are no public topic branches, and
this again means the patch author is free to rewrite the history at
will - and that the (subsystem) committer is free to do additional
edits to the patches when committing (I don't know if they do this -
but they could). The patches sent to the mailing lists will at least
sometimes need to go through a couple of cycles of review - tune, so
additional changes to the topic branches are usually required.

When the patch is good to go, the subsystem maintainer commits it.
After the commit, the patch is essentially part of the history of
Linux kernel already. Linus may reject it, or ask for some more
polishing, but he usually does not do this. The subsystem maintainer
is really the maintainer of that part of the kernel, and is officially
so. If for some reason revert is needed, the subsystem maintainer
simply reverts the patch, and Linux master will get the commit +
revert commit of the rejected patch.

The Git team uses a different workflow: they have public branches they
reset once in a while. The 'pu' (proposed updates) branch mentioned in
these links: 
http://www.kernel.org/pub/software/scm/git/docs/v1.7.2.2/gitworkflows.html,
http://stackoverflow.com/a/5713627. So, even the Git developers do
rebase / reset public branches. We do not have a "pu" branch, but one
might be a good idea in the future.

In the http://kerneltrap.org/Linux/Git_Management link Linus argues
that public branches should never be rebased - and his take on this is
very strict, this would apply to any branch that is potentially
public, and thus to github topic branches too. This would cause a
messy history to appear in the commit history of the topic branches,
and as mentioned above this is not a good idea.

For the idea of making pull requests from a branch which a site is
based on the answer is: sorry, it does not work that way. This turns
the dependency around - the branch the site is dependent on should not
dominate Django's development. Only official subsystem branches have
the property that their history is part of Django's history. The
possible tunings/reverts in the site-based branch have no business of
getting into Django's history. What the pull requester should do is
cherry pick the commits he/she wants to pull into Django, create
another branch for pull, rework the commit history into nice logical
commits and ask for pull from that branch.

It is a good question who should do the squashing - the patch author
continuously, the patch author when the patch has passed reviews, or
the committer. It seems that more often than not the committer will
need to do some final cleanup. Requiring 100% ready pull requests all
the time seems like waste of both committer and patch submitter time.
For almost all patches the squash merge workflow works well, it is
only the largest feature patches where this will lead to too large
commits. Still, to me it seems that we should encourage developers to
aim for 100% ready pull requests in the hope that some contributors
are willing to learn to create such pulls.

Everything I have found suggests that rebased topic branches are a
good idea, and these branches should hide the internal development
stages from public view. If they need to be held non-public just to
make sure they are not a base for other peoples work, then it means we
need to deal with patches instead of github branches and pull
requests. I think that is a bad trade-off. However, there are good
reasons to not rebase the work continuously (collaborative effort for
example).

So, lets aim for a compromise solution. We do not require constant
rebasing or keeping the patch as a series of clean commits. It should
(note: no must here) be series of clean commits when the pull request
is initiated. This makes initial review easier. Then, based on reviews
the branch will get some cleanup commits. At this stage we do not
encourage rebasing or rewriting the history of the work. It is OK to
do merge of master if absolutely needed (but usually this is not
needed, even if there are conflicting changes in the master branch),
and it is OK to get a messy history. When the reviews are done, the
history should be rewritten so that the patch is again a series of
clean commits. This can be done by the patch author, or by the
committer. The easiest way to do this is by squash merge. At this
stage the final merge conflicts are resolved, and the patch series is
turned into a nice clean commit history again.

The "rewrite history only at final stage" workflow allows for easy
collaborative effort, yet it provides a nice clean history into Django
master. It allows the reviewer to make a pull request of spotted
errors. For at least 9/10 of the patches a squash merge is enough. The
catch is that if the patch author doesn't do the final squashing, then
she/he will not end up as the "Author" in the commit logs. If the
changes are clean and small enough in the review stage, then merging
the branch directly can be done. Retaining the reviewer requested
changes as a separate (single) commit might be a good idea actually.
So, this is my current suggestion: The rebase workflow, but with
minimal rebasing. The topic branches are usable as a base for work,
but not in the long term.

The above of course needs to be turned into nice easy to follow
guidelines. We need as-easy-as-possible path for trivial changes. As
suggested upthread, we need to make contributing to Django easy - but
it is also a good idea to ask more from those contributors who can and
have will to do more work. Unfortunately I do not have time to write
the guidelines this week, but it is of course OK to make pull requests
into https://github.com/akaariai/django/compare/django_git_guidelines
- I will not rebase the branch (until submitting it for final review)
- it will be using the above guidelines hereupon.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Reply via email to