[neomutt-devel] GitHub Merging and Squashing

Tue Feb 7 15:01:23 CET 2017

Before discussing the merge and squash strategies on github, I think a
small clarification on how to and what for build your commit history is
needed. 

We often give the rule of "make your commits atomic" as a rationale when
writing commits. But it's easy to make the commits more atomic than
necessary. So the rationale to keep is to think of how the commit
history will be used once you're done working on it. And here are the
common use cases:

1. read the history to keep up with changes ;
2. bisect the history to find when a regression has been introduced ;
3. help with merging or rebasing

On Tue, Feb 07, 2017 at 12:42:15PM +0000, Richard Russon wrote:
[…]
> When to merge?  When to squash?

when working on a PR, you can have two strategies:

## Commit as you go

the most straightforward and most common one is to commit as you go: 

```
do {
  make changes;
  compile;
  test;
  commit;
  push;
} while (no more changes is needed or requested by review team);
```

then you end up with a commit log in the PR that might look like:

```
adds feature F with behaviour B
refactors A to make it simpler
adds documentation and doxygen
fixes race condition
fixes typo in doxygen
adds related feature F' to cope with behaviour B'
fixes edge case handling
updates documentation
```

So this commit history is featuring a lot of very small commits with
very little information, so if we introduce a lot of those:

1. it's making more painful to read the history (lots of noise surrounding the signal),
2. it's going to make bisects take more steps and
3. it's not of any added value for the merges.

That's when the maintainer will want to follow through the squash+merge
strategy. But then the downside would be that the information of the two
related features is squashed into one:

```
adds features F and F' to cope with B and B'
```

meaning the maintainer would have to do fetch the branch, squash twice
manually and only then merge, so it would become:

```
adds feature F with behaviour B
adds related feature F' to cope with behaviour B'
```

## Craft your branch's history

The other strategy, has my preference, it's to "craft" history
as you're working the code. 

When working on a feature (or PR) branch, history is not yet fixed. You
can craft it, rewrite it, as much as you like to give find the right
spot.

Instead of having your commits show the order in which you wrote your
code, you squash as you go:

```
make changes
compile
test
commit
do {
  make changes
  compile
  test
  if (changes related to last commit) {
    commit --amend
  else {
    commit (in a temporary commit)
    rebase -i HEAD~<number of commits you want to see>
    fixup in the older commit
  }
while (no more changes is needed or requested by review team)
```

Then the commit history of the feature branch would look like:

```
adds feature F with behaviour B
adds related feature F' to cope with behaviour B'
```

which would contain atomic commits, just big enough to not clutter the
commit history, without making bisect or merges painful.

Then those can be merged as is, no squashes needed by the maintainer.

The downside of working that way is that by rewriting history, you can
then lose track of the discussion of why a given change happened (at
reviewing time). But hopefully the github PR feature keeps track of all
the commits that were discarded and replaced with the new history, along
with the commits that led to them.

Actually, this is a strategy that's encouraged by reviewing tools like
gerrit, rietveld or phabricator (as well as with github and gitlab PRs).
Of course, it needs the contributor to have a bit of confidence in his
git-fu than with the other one.

Also, when you think about it, this is pretty close to what's being done
when one is submitting patches by mail to a project, as when being
reviewed that way, the contributor resends updated patches, not extra
commits in follow up mails.

> When work is merged into master:
> If there are 1 or 2 commits, then fast-forward master:
>   `merge --ff`
> 
> If there are more than 2 commits, then rebase and merge into master:
>   `merge --no-ff`

Ideally, all branches should be rebased by the branch author to always
stay based on the master branch so the maintainer's work is made easier,
and the merge could be fast forwarded (i.e. no merge commit).

But practically, there are cases where the master has had many little
changes across the code base and doing a merge commit would make it
easier to address all those little changes at once instead of replaying
them through each commit of the PR.

Cheers,

-- 
Guyzmo