A good Git workflow or how to correctly rebase your PR
Git is the tool when it comes to collaborative work. GitHub with its nice UI and its mechanism of forking and pull requests can boost a team's productivity even more. Unfortunately, there isn't just one way to get things done with Git and GitHub. Here I present an approach to working with pull requests that has proven to cause little friction if done correctly.
The essence of this post is summarised in an asciinema screencast. If you want more detail or just prefer reading, continue below the video.
The Setup
For the sake of the exercise, let us say that Alice has started a project and put it on GitHub. Bob wants to contribute while not getting in Alice's way. We can emulate GitHub repositories locally by creating a bare repository for Alice.
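A minimal sketch of this step. The scratch directory and the repository name alice-remote.git are assumptions made for illustration and do not come from the original post. The branch name master matches the origin/master mentioned later; pinning it only keeps the sketch independent of the local init.defaultBranch setting.

```shell
# Work in a throwaway directory so no real project is touched.
DEMO="$(mktemp -d)" && cd "$DEMO"

# A bare repository has no working tree; it stands in for Alice's repository on GitHub.
# The name alice-remote.git is an assumption made for this sketch.
git init --bare alice-remote.git

# Point HEAD at master regardless of the local init.defaultBranch setting.
git --git-dir=alice-remote.git symbolic-ref HEAD refs/heads/master
```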
Alice then clones her own repository and adds some content.
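This could look roughly as follows. The sketch recreates the bare repository so that it runs on its own. The directory names, the README.md file, its content, and the placeholder identity are all hypothetical.

```shell
# Recreate Alice's "remote" repository in a throwaway directory.
DEMO="$(mktemp -d)" && cd "$DEMO"
git init --bare alice-remote.git
git --git-dir=alice-remote.git symbolic-ref HEAD refs/heads/master

# Alice clones her (still empty) repository; Git warns that the clone is empty.
git clone alice-remote.git alice-local
cd alice-local
git config user.name "Alice"                # identity for this repository only
git config user.email "alice@example.com"   # placeholder address

# First content: a hypothetical README.md.
echo "Hello" > README.md
git add README.md
git commit -m "Add README"

# Make sure the branch is called master, then publish it.
git branch -M master
git push -u origin master
cd "$DEMO"
```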
Enter Bob. We emulate the fork functionality of GitHub by making a copy of the “remote” repository and cloning it into a “local” repository that Bob can work with. We also define Alice's remote repository as a remote resource for Bob's repository.
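One way to emulate the fork, sketched under the same assumptions as before: all directory names are hypothetical, and so is the remote name upstream, which the original post does not specify. The first part condenses Alice's side so that the script is self-contained.

```shell
# Alice's side, condensed: a bare "remote" repository with one commit on master.
DEMO="$(mktemp -d)" && cd "$DEMO"
git init --bare alice-remote.git
git --git-dir=alice-remote.git symbolic-ref HEAD refs/heads/master
git clone alice-remote.git alice-local
git -C alice-local config user.name "Alice"
git -C alice-local config user.email "alice@example.com"
echo "Hello" > alice-local/README.md
git -C alice-local add README.md
git -C alice-local commit -m "Add README"
git -C alice-local branch -M master
git -C alice-local push -u origin master

# "Fork": a bare copy of Alice's repository plays the role of Bob's fork on GitHub.
git clone --bare alice-remote.git bob-remote.git

# Bob's working copy is a clone of his fork, so origin points at bob-remote.git.
git clone bob-remote.git bob-local
cd bob-local
git config user.name "Bob"
git config user.email "bob@example.com"     # placeholder address

# Register Alice's repository as a second remote (the name upstream is an assumption).
git remote add upstream "$DEMO/alice-remote.git"
git fetch upstream
git remote -v
cd "$DEMO"
```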
Ideally
In a perfect world, Bob makes his changes locally and rebases his local branch onto the current state of Alice's remote repository before pushing to his remote repository. Let's say Alice has committed a change and pushed it to her remote repository.
In parallel, Bob makes a similar change in his local repository.
Before pushing his changes, he fetches the current state of Alice's remote repository and rebases his branch onto hers.
Git will then complain about a merge conflict. The important thing here is that it is now Bob's responsibility to fix the conflict rather than burdening Alice with it. He resolves the conflict in the affected file, stages the result, continues the rebase, and pushes to his remote repository.
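The whole sequence might be sketched like this. The file, its contents, the commit messages, and the way the conflict is resolved (keeping both lines) are invented for illustration. The original post does not say which changes Alice and Bob made. GIT_EDITOR=true only keeps the rebase from opening an editor for the commit message.

```shell
# Setup, condensed: Alice's repository with one commit, Bob's fork and clone.
DEMO="$(mktemp -d)" && cd "$DEMO"
git init --bare alice-remote.git
git --git-dir=alice-remote.git symbolic-ref HEAD refs/heads/master
git clone alice-remote.git alice-local
git -C alice-local config user.name "Alice"
git -C alice-local config user.email "alice@example.com"
echo "Hello" > alice-local/README.md
git -C alice-local add README.md
git -C alice-local commit -m "Add README"
git -C alice-local branch -M master
git -C alice-local push -u origin master
git clone --bare alice-remote.git bob-remote.git
git clone bob-remote.git bob-local
git -C bob-local config user.name "Bob"
git -C bob-local config user.email "bob@example.com"
git -C bob-local remote add upstream "$DEMO/alice-remote.git"

# Alice changes the greeting and pushes to her remote repository.
echo "Hello from Alice" > alice-local/README.md
git -C alice-local commit -am "Alice changes the greeting"
git -C alice-local push origin master

# In parallel, Bob changes the same line in his clone (not pushed yet).
echo "Hello from Bob" > bob-local/README.md
git -C bob-local commit -am "Bob changes the greeting"

# Bob fetches Alice's state and rebases onto it; the rebase stops with a conflict.
cd bob-local
git fetch upstream
git rebase upstream/master || echo "rebase stopped: conflict in README.md"

# Bob resolves the conflict by hand (here: keep both lines), stages the file, and continues.
printf 'Hello from Alice\nHello from Bob\n' > README.md
git add README.md
GIT_EDITOR=true git rebase --continue

# Bob's commit now sits on top of Alice's, so a plain push to his fork succeeds.
git push origin master
cd "$DEMO"
```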
Et voilà, his remote repository has a clean history that Alice can merge fast-forward style. On GitHub, Bob would create a pull request and Alice would merge it with one click of a button. In our example, we emulate this by having Alice fetch Bob's branch and fast-forward her own branch to it.
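A sketch of the emulated merge. To keep it short, Bob's fork here is simply one commit ahead of Alice's master, which is the state a successful rebase leaves behind. The remote name bob and all paths are hypothetical. The --ff-only flag makes Git refuse anything other than a fast-forward, so a history that needs a real merge would be reported instead of silently merged.

```shell
# Setup, condensed: Alice's repository with one commit, Bob's fork and clone.
DEMO="$(mktemp -d)" && cd "$DEMO"
git init --bare alice-remote.git
git --git-dir=alice-remote.git symbolic-ref HEAD refs/heads/master
git clone alice-remote.git alice-local
git -C alice-local config user.name "Alice"
git -C alice-local config user.email "alice@example.com"
echo "Hello" > alice-local/README.md
git -C alice-local add README.md
git -C alice-local commit -m "Add README"
git -C alice-local branch -M master
git -C alice-local push -u origin master
git clone --bare alice-remote.git bob-remote.git
git clone bob-remote.git bob-local
git -C bob-local config user.name "Bob"
git -C bob-local config user.email "bob@example.com"

# Bob's fork ends up one commit ahead of Alice's master.
echo "Hello from Bob" >> bob-local/README.md
git -C bob-local commit -am "Bob extends the greeting"
git -C bob-local push origin master

# Alice fetches Bob's branch and fast-forwards her own master to it.
cd alice-local
git remote add bob "$DEMO/bob-remote.git"
git fetch bob
git merge --ff-only bob/master

# Publishing the result corresponds to the merged pull request on GitHub.
git push origin master
cd "$DEMO"
```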
Look ma, no errors! Awesome!
Realistically
Unfortunately, we don't live in an ideal world. In fact, most of the time we will work with more than two people in parallel and they will contribute in a much more chaotic fashion than in this somewhat contrived example.
What happens if Bob commits and pushes his changes without rebasing first? Or maybe he did rebase, but Alice made some changes before integrating his commits. In both cases we are left with the problem that Bob has to rebase a branch that he has already pushed to his remote repository.
Now Bob creates another pull request, but Alice tells him that she can't merge it automatically. He goes back to his local repository and rebases again, fixing the merge conflict as before.
But now Git tells him that he can't push because his branch is behind origin/master. What happened is that when he rebased onto Alice's commit, the hash of his commit changed, and Git doesn't recognise that there are two commits with different hashes that contain the same changes. At this point he could pull from his remote again (using --rebase to avoid creating an extra merge commit) and resolve the same merge conflict again. Finally, Git will allow him to push again, but his history will contain duplicate commits. :(
There is, however, another way. Though discouraged in every other scenario, git push -f is the only way forward in this case. This effectively overwrites the history of the remote repository and should therefore be handled with extreme care.
Once Bob has ensured that he is in the right folder and on the right branch, he proceeds to type git push -f.
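The realistic scenario, from the premature push to the forced one, could be reproduced with a script along these lines. As before, the repository names, the README.md contents, the commit messages, and the remote name upstream are assumptions made for the sketch. The explicit origin master arguments are simply one way of spelling out the push target.

```shell
# Setup, condensed: Alice's repository with one commit, Bob's fork and clone.
DEMO="$(mktemp -d)" && cd "$DEMO"
git init --bare alice-remote.git
git --git-dir=alice-remote.git symbolic-ref HEAD refs/heads/master
git clone alice-remote.git alice-local
git -C alice-local config user.name "Alice"
git -C alice-local config user.email "alice@example.com"
echo "Hello" > alice-local/README.md
git -C alice-local add README.md
git -C alice-local commit -m "Add README"
git -C alice-local branch -M master
git -C alice-local push -u origin master
git clone --bare alice-remote.git bob-remote.git
git clone bob-remote.git bob-local
git -C bob-local config user.name "Bob"
git -C bob-local config user.email "bob@example.com"
git -C bob-local remote add upstream "$DEMO/alice-remote.git"

# Bob commits and pushes to his fork without rebasing first.
echo "Hello from Bob" > bob-local/README.md
git -C bob-local commit -am "Bob changes the greeting"
git -C bob-local push origin master

# Meanwhile Alice changes the same line and pushes to her repository.
echo "Hello from Alice" > alice-local/README.md
git -C alice-local commit -am "Alice changes the greeting"
git -C alice-local push origin master

# Bob rebases onto Alice's new state and resolves the conflict.
cd bob-local
git fetch upstream
git rebase upstream/master || echo "rebase stopped: conflict in README.md"
printf 'Hello from Alice\nHello from Bob\n' > README.md
git add README.md
GIT_EDITOR=true git rebase --continue

# A plain push is rejected: the rebased commit has a new hash, so the fork's
# master is no longer an ancestor of Bob's local master (non-fast-forward).
git push origin master || echo "push rejected"

# Double-check the branch, then overwrite the history of the fork.
git status --short --branch
git push -f origin master
cd "$DEMO"
```

A more cautious variant is git push --force-with-lease, which refuses to overwrite the remote branch if it has moved since Bob last fetched it.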
And Alice once again sees a nice and clean pull request that can be merged automatically.
Acknowledgements
Tim made me realise that I was wrong to think I could somehow rebase my pull request without using git push -f. He inspired me to look into this in detail and write this post.
Questions, Comments, …
As usual, if you have questions or if I got something wrong, head over to the corresponding issue on GitHub to discuss this article.