July 23, 2019
Know what really grinds my gears? A git DAG that looks like a pile of spaghetti.
That's a screenshot from Atlassian Sourcetree for a real repository at my work (I rotated it 90 degrees for better layout in this post). Here's another (much more mild) real example:
* 1d50ef2 O |\ | * d718abb N <-- should have been rebased on 520009a (M) * | 520009a M |\ \ | |/ | * e117172 L <-- should have been rebased on 28c9618 (K) * | 28c9618 K |\ \ | |/ | * b498d5d J <-- should have been rebased on fdde87d (I) * | fdde87d I * | a7afa18 H |\ \ | * | 3dcd450 G | * | b654da1 F | * | 0895247 E |/ / * | 8b7931d D |\ \ | |/ | * b0fb1ae C | * a27228f B |/ * f9959b4 A
This is what you might see when doing
git log --graph --oneline. But
--graph is actually hiding part of the problem. Here's the same tree without
1d50ef2 O d718abb N 520009a M e117172 L 28c9618 K <---+ this commit merged b498d5d (J) to master fdde87d I | a7afa18 H | b498d5d J -----+ 3dcd450 G b654da1 F 0895247 E 8b7931d D b0fb1ae C a27228f B f9959b4 A
Note how the history is not really in order. HOW is it meaningful to say
b498d5d (J) occurred before
a7afa18 (H) and
fdde87d (I) if it was
merged after? If you were trying to figure out what to revert, this would be
at least a little bit misleading or confusing. This is just one problem caused
by allowing Spaghetti Merges. I'm a bit anal retentive about this, but
keeping a clean git history really does help when tracking down bugs, etc.
Another thing to consider is isolated testing. Whether testing some changes
manually or running CI on a pull request, there is some significance to the
current branch base at the time when the testing is done. In the example above,
b498d5d (J) was tested with a branch base of
b0fb1ae (C). That means that
3dcd450 (G), and
fdde87d (I) were
opportunities for semantic conflicts (the other commits are just merges).
The bors project was created to prevent semantic conflicts. This illustration shows an example of how isolated testing and an unlucky merge can lead to a broken build on master. On large projects, it can be nigh impossible to successfully enforce manually rebasing before every merge. The chances are simply too high that another merge will have occurred between your most recent rebase and the completion of automated or manual tests afterwards. This is why bors is invaluable on large projects (such as servo).
It's a little unfortunate that bors is specifically designed for Github, since many enterprise software engineers use Bitbucket or a self-hosted git server, and Gitlab has been growing in popularity. I'll be keeping my eye out for tools that can accomplish the same thing without requiring Github. At my previous job we typically did a pretty good job of rebasing before merging. Since we had a rather time consuming CI pipeline, we implemented a merge queue for preventing that frustrating cycle of rebase + retest. You only have to rebase once: when it's your turn to merge.
I think we can all agree that it's very nice to have a clean history:
__ooooooooo__ oOOOOOOOOOOOOOOOOOOOOOo oOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo oOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo * ef533e1 oOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo |\ oOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo | * a8236d5 oOOOOOOOOOOO* *OOOOOOOOOOOOOO* *OOOOOOOOOOOOo | * 8c5f2ca oOOOOOOOOOOO OOOOOOOOOOOO OOOOOOOOOOOOo | * f769333 oOOOOOOOOOOOOo oOOOOOOOOOOOOOOo oOOOOOOOOOOOOOo | * 974ea79 oOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo | * f5e751b oOOOO OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO OOOOo |/ oOOOOOO OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO OOOOOOo * 78433d7 *OOOOO OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO OOOOO* |\ *OOOOOO *OOOOOOOOOOOOOOOOOOOOOOOOOOOOO* OOOOOO* | * 54946cb *OOOOOO *OOOOOOOOOOOOOOOOOOOOOOOOOOO* OOOOOO* |/ *OOOOOOo *OOOOOOOOOOOOOOOOOOOOOOO* oOOOOOO* * 599af99 *OOOOOOOo *OOOOOOOOOOOOOOOOO* oOOOOOOO* *OOOOOOOOo *OOOOOOOOOOO* oOOOOOOOO* *OOOOOOOOo oOOOOOOOO* *OOOOOOOOOOOOOOOOOOOOO* ""ooooooooo""
Let's do what we can to make that happen.