top | item 7948953

Merge Pull Request Considered Harmful

271 points| watson | 11 years ago |blog.spreedly.com | reply

111 comments

order
[+] drunken_thor|11 years ago|reply
I really do not see the problem with the rails commit history. Having those merge messages that point back to a pull request leaves lots of documentation about what was done and why it was done. I really do not know why people are so particular about their git log either. It shows an accurate history of the repo, not a revised cleaned up history.

However being able to edit pull request before merging was a good thing to learn. However I think in the long run I would want to learn it with plain old git rather than tack on another tool to my workflow.

[+] ntalbott|11 years ago|reply
So to be clear, every commit in the ActiveMerchant history also points back at the initiating PR via the commit message including the PR number (which gets linked up). So there's no real loss of information in that sense

And I hear your RE adding another tool, as I'm very cautious about that myself. That said, `git am` is "git cannon", and all hub is really adding is the ability for it to slurp Github urls as well as Maildirs.

Neat, related trick: add ".patch" to the end of any PR url on Github and you'll see a formatted patch for the PR. All hub is really doing is slurping that and passing it to git as though it came from a mailing list.

[+] livingparadox|11 years ago|reply
Since the author of the article decided to invoke Linus, I thought I'd see what Linus thought about the merge vs rebase debate.

http://www.mail-archive.com/[email protected]/...

Turns out Linus also agrees with drunken_thor, that merge messages are useful. Suppose that's why Linus added merges in the first place. ;)

[+] agmcleod|11 years ago|reply
Yeah i feel the same way. Never been one to squash my commits, even when i commit something stupid and have to revert it. We have three owners and one collaborator on a javascript project right now, we've been good on keeping up with pull requests, and closing off bugs. I can see where a bigger project like ActiveMerchant could get nuts if only one or two people are handling it.
[+] ForHackernews|11 years ago|reply
Yeah, I found the notion of a commit being "history worthy" kind of silly. If that's how it happened, then it's history! It's not a value judgement.
[+] MaikuMori|11 years ago|reply
Is this exactly situation that https://help.github.com/articles/checking-out-pull-requests-... documents?

So basically:

    git fetch origin pull/ID/head:BRANCHNAME
    git checkout BRANCHNAME
And now you can edit that pull request and/or make new pull request based on the previous one. Or just simply merge in master.
[+] watson|11 years ago|reply
This also eliminates the use of the "Merge pull request" button which removes all the pull-request-merged-commits.

I don't know about you guys, but I have not found them useful? Do you use the merge commits for anything?

[+] andrewchilds|11 years ago|reply
The problem is that pull requests are one-way, read-only collaborations. If I want to make changes, I have to close the contributor's PR and open my own. With write access, I could do:

    git fetch origin pull/ID/head:BRANCHNAME
    git checkout BRANCHNAME
    // make changes
    git push origin pull/ID/head
[+] ntalbott|11 years ago|reply
OP author here...

The result is the same, but I've found the `git am` workflow to be much smoother vs. mucking around with remotes. Some of that could be due to the nature of the OSS project I work on - ActiveMerchant - since just in the last month we've had 30+ unique contributors, and most of them have contributed a single change.

My general recommendation is to make sure you try out the `git am` flow for a bit, but then just do what works best for you.

[+] shadowmint|11 years ago|reply
Hm... this seems like a very complicated way of saying that github should have a way to merge pull requests into a new branch.

...but it doesn't so you have to:

    - checkout a local copy
    - add a remote to the PR
    - checkout a new branch
    - merge the PR into your local branch
    - fix code, merge to master
Which is entirely true; it is annoying.

The simple solution, though, is to require pull requests to come in a feature branch, and flat out reject any that target master. /shrug

[+] eric_bullington|11 years ago|reply
>The simple solution, though, is to require pull requests to come in a feature branch, and flat out reject any that target master. /shrug

Agreed, and in my experience, every major open source project I'm familiar with requires pull requests to feature branches. In fact, most small projects use the same workflow.

Is this not the case with most open source projects?

[+] ntalbott|11 years ago|reply
Using git remotes and distinct branches per contribution is totally legit, and I still do it sometimes. But, I'm often/usually dealing with contributions that have already been reviewed and don't need a whole feature branch/review within the root repository before inclusion. And for that setup - where it's almost ready - it's so much easier to just `git am` it into master, make necessary tweaks, and push.

In general I'd just encourage maintainers to try the `git am` flow, especially on small to medium complexity contributions where it looks mostly ready to go and you don't want to do another week of ping-pong with the contributor just to get a variable renamed or some whitespace fixed.

As I tell my kids, "Just try one bite of <food I find delicious>. If you don't like it, that's cool, then it's more for me!" :-)

[+] russelluresti|11 years ago|reply
Agreed. Having a pull request go to a feature branch (or bug branch, or whatever your pull request is aiming to do) is the best way to go about it.

The issue described by the author is definitely more of a workflow problem than a technology or service problem.

[+] thegeomaster|11 years ago|reply
Torvalds himself doesn't like GitHub's pull request workflow for several reasons and doesn't accept pull requests on GH for the Linux kernel, see [1].

[1]: https://github.com/torvalds/linux/pull/17#issuecomment-56546...

[+] watwut|11 years ago|reply
He seems to be complaining primary about quality of commit messages, missing emails, commit sign offs an things like that. It is more about bureaucracy (which is arguably important on project if linux size) then about workflow itself.
[+] owenversteeg|11 years ago|reply
You know, when I started reading this I thought that I would disagree with you and write a grumpy comment to that effect. ("Considered Harmful" usually makes me grumpy.) However, you convinced me otherwise. Congratulations on a good "Considered Harmful" piece!

Here's another example: I recently made a PR to a project to fix a broken URL. I changed a wrong file and forgot about the PR. The maintainer had to close the PR, make the change himself, and explain why/what he did. This whole process would've been a lot simpler if Github allowed people to merge to another branch.

[+] IanCal|11 years ago|reply
> This whole process would've been a lot simpler if Github allowed people to merge to another branch.

Is this a problem that comes from forking? I raise PRs on my projects onto various different branches all the time. Or maybe I've misunderstood the issue.

[+] watson|11 years ago|reply
I just tried this on an open pull request we had. Pretty ok experience. For some reason though, there where a merge conflict when running the "hub am -3 <url>" command. In my case was easy to fix, but Github reported the PR to be mergeable, so maybe Github uses a different merge strategy for PR's than "hub am" out of the box?
[+] davedx|11 years ago|reply
Yeah, I don't get this hub thing. When I want to manually merge on a Github project I just follow the GH instructions on the PR page to manually checkout the fork, do my work there, and merge it locally then push. Why do you need another tool to do this?
[+] ntalbott|11 years ago|reply
As I understand it `git am` is actually replaying the series of commits on top of the branch you run it in, whereas a merge commit flattens the commits out and applies them in one go (while keeping full history of the commits begin merged). So it's definitely a different strategy, and you can get conflicts where you wouldn't with a merge commit. As you said, though, they're super easy to fix when they happen. Often when I see a conflict like this it's because there are 52 (OK, exaggerating, a little) commits in the contribution and using the `git am` flow is a huge win since I'm now in a good spot to squash the down to one or two commits.
[+] watson|11 years ago|reply
Just a note to people who want to try this: According to the post you should use the command "git am -3 <url>", this seems to be a typo and should be: "hub am -3 <url>"
[+] ytjohn|11 years ago|reply
Author didn't mention it, but in the hub documents they instruct you to `alias git=hub`. So when author is running git, he's really running hub.
[+] nirvdrum|11 years ago|reply
I'm not sure how I feel about someone writing something and making me the author. I guess attribution makes sense in the form of paraphrasing. But I tend to think of commits as literal quotations.
[+] jessaustin|11 years ago|reply
I see what you mean, and it almost seems like the "sin" isn't the additional editing, it's the subsequent rebase. It's one thing to require contributors to rebase, but it seems like another to rebase for them. Maybe I'm missing something, but based on the "History Is Written By the Victors" section of TFA I would have been surprised to see anyone other than @ntalbott at the top of https://github.com/Shopify/active_merchant/graphs/contributo...
[+] ntalbott|11 years ago|reply
You may think of commits as literal quotations of the "Author", but it's important to realize that git doesn't. If you want to know who's ultimately responsible for a commit, you always want to look at the "Committer" since that's who actually signed on the dotted line so to speak.
[+] joshribakoff|11 years ago|reply
I've always just pulled in the contributor's branch, made additional commits to fix things up, and squashed all the commits before pushing up to GitHub. Not very difficult
[+] gleenn|11 years ago|reply
Interesting problem for committers. Since Github made the hub tool, maybe they could make this person's flow better right from the web UI somehow.
[+] ntalbott|11 years ago|reply
Maybe, but... probably not. The thing is, I want to work with the changes locally before they get included. So it can't be a 100% web workflow. That said, there might be ways the web UI could make the transition to the CLI easier, and it would be great if Github promoted the `git am` flow on the PR pages as well.
[+] reledi|11 years ago|reply
What works best for me is to use a combination of the Pull Request button and to manually merge or rebase changes when necessary.

I used to strictly stay away from the Pull Request button, because it made my history "messy". Now I care less about that and more about the convenience (when it's appropriate).

[+] fra|11 years ago|reply
I wish github would either detect when a branch is merged back into master and automatically close the PR, or allow for some more complex merge operation. Squash merge would be particularly helpful.
[+] Siecje|11 years ago|reply
You should be able to update your fork to be the same as upstream, with one click on your fork.
[+] ricket|11 years ago|reply
"And as a hypothetical story-telling construct I created in my mind, Jane agrees 100%!"

That's a very strange way to end an article. Could have just stopped before adding that last line.

[+] aut0mat0n1c|11 years ago|reply
Using the term 'considered harmful' in a blog post title considered harmful.
[+] SimeVidas|11 years ago|reply
I love the approach but hub seems to be Mac only :(
[+] nfm|11 years ago|reply
It's just a ruby script, and it works on Linux/Mac/Windows.
[+] warfangle|11 years ago|reply
It doesn't build properly on Linux?
[+] leccine|11 years ago|reply
"If the contribution is a single commit, I’ll do a git commit --amend and just smoosh my changes right into the original commit."

that is right, editing history is the way to go, or is it? :)

[+] grogenaut|11 years ago|reply
Considered Harmful blog posts Considered Harmful
[+] ntalbott|11 years ago|reply
You forgot to reference the great recursive essay Eric Meyer wrote on exactly this topic:

http://meyerweb.com/eric/comment/chech.html

Which essay I already knew about and chose to willfully ignore when I titled the OP. We could chat over beers sometime as to whether I'm a bad person for doing so :-)

[+] CtrlAltDel|11 years ago|reply
We used to be elitist about our choice of language. We've evolved to become elitist about how we use source control.
[+] thinkt4nk|11 years ago|reply
harmful?
[+] rjbwork|11 years ago|reply
It's a play on the classic Dijkstra letter, "Goto Considered Harmful." Also yes, I think it's harmful. A co-worker and I wanted to use a new up and coming FOSS project that's hosted on GitHub, but we needed a certain killer feature. Said co-worker worked for about 4 weekends in a row, and fully implemented it, with nice abstraction and separation of concerns. The code was then turned down because it was "too seperated, and unnessecarily abstracted" which we disagreed with. Thus the pull request, with an amazing feature, sits languishing because he nor I is going to spend another few weekends working on a project that won't integrate new features unless they're perfect and conform to the original author's idea of "perfect code".