This guide is heavy on the mechanical side and misses a lot of important substantive parts, if your goal is to add value to an open source project.
Don't just create a fork, branch, and submit a PR without context. First, make sure the intent of your change is actually desired. Just because someone opened an Issue does not mean that it belongs in the project. Anyone in the world can open a GitHub Issue for any reason. Instead, engage and discuss the Issue first and make sure it's actually something the project wants.
Don't just start writing code. Familiarize yourself with the codebase. This comes naturally if you are a user of the project: you will run into bugs, learn the software's behaviors, and discuss the Issue or features with maintainers. There are far fewer right ways to build a feature than possible ways.
Finally, understand that your contribution is not "free" for the project. It takes time and consideration to even look at your PR and even more to code review it. The more popular the project, the more true this is.
There is, however, value in submitting a PR without diving into the often stalemating "discussion" happening in a lot of projects. As much as I wish it weren't the case, I have found over the years that if your code follows conventions, works, and is useful, asking for forgiveness is much, much easier than asking for permission.
> Don't just create a fork, branch, and submit a PR without context. First, make sure the intent of your change is actually desired.
I think all that matters is that the change is something you want. If no one else has any need for it, you can continue using it as long as you wish to maintain your fork.
It's great to collaborate and share ideas before you start working, but if it's something you need, you'll do it even if everyone else says it's pointless.
> Finally, understand that your contribution is not "free" for the project. It takes time and consideration to even look at your PR and even more to code review it. The more popular the project, the more true this is.
Understand code interactions: they scale as N^2 with the number of features. Specifically, each interaction your feature has with all the other features has to be coded, and then each new feature might interact with yours. This is the curse of scope creep.
It is the sole responsibility of the PM/owner of the project to select which features are worth this.
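To put rough numbers on the N^2 point, here is a minimal Python sketch. It assumes every pair of features can potentially interact, which is a worst case rather than a measurement of any real project, and the feature names are made up for illustration.

```python
from itertools import combinations


def pairwise_interactions(features):
    """Return every unordered pair of features that could interact."""
    return list(combinations(features, 2))


def interaction_count(n):
    """Number of unordered pairs among n features: n * (n - 1) / 2."""
    return n * (n - 1) // 2


features = ["auth", "search", "export", "plugins"]  # hypothetical feature names
print(len(pairwise_interactions(features)))  # 6 potential interactions for 4 features

# A fifth feature adds one new potential interaction per existing feature.
print(interaction_count(5) - interaction_count(4))  # 4
```

The count grows quadratically, so each accepted feature makes the next one slightly more expensive to review and maintain.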
There is one very important tip that's missing: Follow the original coding style exactly.
Not just spaces vs tabs or block styles, but idioms and other idiosyncrasies, too. Why? Imagine reading a source repo where every second block uses different bracket styles, mixing spaces with tabs and so on. It's going to look like a kludgy mess, and will be distracting to read.
There is no correct style for most languages (perhaps `go fmt` might be an exception), only opinions.
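As a rough illustration of the "spaces vs tabs" part only, here is a small Python sketch that reports which indentation styles a piece of source uses, so a patch can be compared against what the project already does. The function names are invented for this example; a real project would normally rely on its own linter or formatter instead.

```python
def indentation_styles(source):
    """Return the set of indentation styles ('tabs', 'spaces', 'mixed') used in source."""
    styles = set()
    for line in source.splitlines():
        # Leading whitespace is whatever lstrip would remove from the front.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent or not line.strip():
            continue  # skip unindented and blank lines
        if set(indent) == {"\t"}:
            styles.add("tabs")
        elif set(indent) == {" "}:
            styles.add("spaces")
        else:
            styles.add("mixed")
    return styles


def matches_project_style(source, project_style):
    """True if every indented line in source uses only project_style."""
    return indentation_styles(source) <= {project_style}


patch = "def f():\n    return 1\n"  # hypothetical contribution
print(matches_project_style(patch, "spaces"))  # True
print(matches_project_style(patch, "tabs"))    # False
```

This covers only the mechanical layer; idioms and naming conventions still have to be picked up by reading the surrounding code.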
I agree, and it's one of the reasons why novice-to-intermediate programmers should try to pitch in to a project, even for something very minor. I remember making a pull request to a Ruby project and getting rejected and being told to fix all the rubocop errors, which made me aware of the existence of tools for auto-style-detection/linting, and of best practices in style that greatly improved my programming experience.
That kind of practical thing is not well-covered in tutorials and self-learning curriculums.
I dislike the idea of using 'origin' for my own remote name.
I keep 'origin' as the canonical remote and my local master branch tracks origin/master. I use people's usernames for their remotes (including for my own).
I've seen people at work who are new to git/GitHub struggle a lot with the 'origin'/'upstream' differentiation recently, especially when they're learning branching, and they don't seem to have any problems once I switch them over to using 'origin' + usernames.
FYI: GitHub will create a branch in your repo for pull requests. You shouldn't need to pull directly from someone else's repo just to look at their pull request.
Same here. Except I use 'myfork' as the name for my remote repo. Origin/upstream always got me confused when I first started learning Git, so I created my own naming conventions.
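For anyone who wants to try the 'origin' + usernames convention, here is a minimal Python sketch that only builds the git command lines (it does not run them). The username `alice`, the repo `widget`, the branch names, and the PR number are all hypothetical, and these are not necessarily the exact commands the commenters above use. The `pull/<number>/head` ref is the one GitHub exposes for pull requests.

```python
def github_url(owner, repo):
    """Build an HTTPS clone URL; the owner/repo values used below are hypothetical."""
    return f"https://github.com/{owner}/{repo}.git"


def add_remote(owner, repo):
    """Name the remote after the fork owner's username."""
    return ["git", "remote", "add", owner, github_url(owner, repo)]


def push_feature(username, branch):
    """Push a feature branch to your own fork, a remote named after you."""
    return ["git", "push", username, branch]


def fetch_pull_request(number, local_branch):
    """Fetch a pull request from the canonical repo via GitHub's pull/<number>/head ref."""
    return ["git", "fetch", "origin", f"pull/{number}/head:{local_branch}"]


for command in (
    add_remote("alice", "widget"),
    push_feature("alice", "fix-typo"),
    fetch_pull_request(123, "pr-123"),
):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` as-is if you would rather execute the commands than print them.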
I just want to emphasize that if reviewers ask you to make changes to your pull request, it is not a rejection or lack of appreciation. As the maintainer of an open source project, I greatly value contributors who will iterate and iterate until the change is accepted, and often, I will give them push rights ("collaborator" status) to pay it forward.
And if a change is rejected, it's usually because there was not enough discussion beforehand about how to solve the problem, the change itself did not undergo enough discussion/iterations, or the change is not really a solution to a problem. (It's not the maintainers saying "Go away and never come back" -- more like, "Thank you for your effort! Please approach this differently.")
If you'd like to avoid switching to the browser in your development workflow, I recommend checking out Git-Repo [0] that was posted to HN a little while ago [1].
Git-Repo basically tries to put as many steps of the contribution workflow in terminal as possible, by using the API of the git hosting services. It currently supports GitHub, GitLab, and Bitbucket.
For many projects, GitHub is just a place to publish yet another public repo. Using GitHub issues and pull requests is a surefire way to feel ignored. If you want to contribute, e-mail the lead maintainer. Do not submit patches to the ether. Do not think anyone will look at your patches. Having started several large open source projects, and started / worked for a number of open source companies, I can tell you the best way to get involved is to work on your personal relationship with the other developers. If that means hanging out in IRC or Slack, that's what it takes. GitHub is a terrible form of communication, especially when your org / developers have 100+ repos.
If the author of the repo can't be bothered to look at the issue or a PR then I don't see how they'll be receptive to emails.
Most projects don't have other ways of contacting the people involved (very few list their e-mail addresses and even fewer have dedicated IRC or Slack channels).
Yes, it happens way too often that issues and PRs are ignored but realistically it's a strong signal to stop investing more time into such projects because it's unlikely things will improve.
Check that there is not a mailing list first. This is where the "read the contributing guide" kicks in.
I, for example, have quite a lot of filters that make sure I reply to mailing lists, but if you email me directly, you may end up lumped in with all the commercial "include my $thing in your project please" emails (purely by accident).
While it's true you might want to email the author and all, having a PR is great for one thing: letting other people looking for the bug/feature find the fix before it's upstreamed.
This is what I usually do with my PRs. I first submit the PR so it's there for other people to grab if they need it, and then I'll look into the "proper" way of upstreaming the fix.
This avoids duplication of work and enhances discoverability.
Step 1: stumble upon a terrible bug (or that really obvious missing feature that _should_ be there) in your favorite library / framework / app.
Step 2: rant about it on HN / GitHub issues / whatever.
Step 3 (optional): try to reach developers on GitHub and get the obligatory "pull requests are welcome" response.
Step 4: In frustration, clone the repository, fix the damn bug and submit your pull request.
Steps 5..41: have an angry and emotional discussion with the devs who refuse to accept your PR because broken binary compatibility / regression tests / coding style / your choice of variable names etc. Fix all these issues and resubmit the PR until it's accepted.
Step 42: And this is how you become a contributor to high-profile projects like Docker, Akka, Spark etc, and are now free to boast about it in your CV!
Disclaimer: I am an akka and akka-http community contributor. I don't know how this works for other projects.
But I don't think the process is as painful as you describe in the akka world. My experiences are quite to the contrary. The community here is warm and welcoming. But, please look at this from the other side. Would you use the software in your mission critical application if the project was quickly accepting contributions from random strangers on the internets?
Step 4 requires no frustration. It should be a relief you can fix it yourself rather than hope for some opaque engineering team to maybe fix it at some future date since your company doesn't have the Platinum Support package.
> Steps 5..41: ...
I think maintainers of small projects should do their best to merge whatever is given to them without back and forth or delay--just a prompt and sincere thanks. There's no reason to treat contributors like junior devs you're trying to teach and guide. Sometimes that requires swallowing some pride in having the code be just right or means you may want to fix up a few things after the merge.
> Steps 5..41: have an angry and emotional discussion with the devs who refuse to accept your PR because broken binary compatibility / regression tests / coding style / your choice of variable names etc. Fix all these issues and resubmit the PR until it's accepted.
As a maintainer, I can completely see where you're coming from here and some of the nits that I have against PRs probably seem quite trivial. However, if you have a fairly large project then as a maintainer the onus is on us to make sure that things don't break (and that the project remains manageable). Not to mention that the argument people use (that we should just make a follow-up commit that fixes the PR so it's actually acceptable) doesn't _really_ work if you're someone who has to do bisects often -- I'm not going to merge code that is clearly wrong in several places (unless the code is returning a "not implemented" error).
It's not a pride thing, as maintainers we have much less time than the contributors (there are fewer of us than there are of you) so we can't carry every PR that someone drops on our issue tracker.
Of course, for runC we have automated testing that verifies that things like golint and gofmt succeed (so we don't have to waste time going back-and-forth with contributors about it). Lots of smaller projects don't do this, which leads to nothing but annoyances for contributors.
Also, in general when reviewing a change I try to state whether or not the idea is sound. If I am commenting on your choice of variables or how you've structured your algorithm, that means that I like the idea (and is an implicit "I will merge this once you fix <these set of nits>"). Not all maintainers do this, but most learn to do this quite quickly.
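The bisect point can be made concrete with a small Python sketch of the binary search that `git bisect` performs conceptually. This is a simplified model, not git's implementation: the commit ids are hypothetical, and it assumes a linear history that flips from good to bad exactly once.

```python
def first_bad_commit(commits, is_good):
    """Binary-search an ordered history for the first commit where is_good is False.

    Assumes commits[0] is good, commits[-1] is bad, and the history flips
    from good to bad exactly once.
    """
    low, high = 0, len(commits) - 1  # low is known good, high is known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_good(commits[mid]):
            low = mid
        else:
            high = mid
    return commits[high]


history = ["c1", "c2", "c3", "c4", "c5", "c6"]  # hypothetical commit ids
regression = {"c5", "c6"}  # the bug being hunted first appears in c5
print(first_bad_commit(history, lambda c: c not in regression))  # c5

# If c3 was merged in a broken state and only repaired by a follow-up commit,
# the single-flip assumption no longer holds and the search blames c3 instead.
broken_merge = {"c3"}
print(first_bad_commit(history, lambda c: c not in (regression | broken_merge)))  # c3
```

The second call shows why "merge now, fix in a follow-up" is costly for people who bisect: one knowingly broken commit is enough to send the search to the wrong place.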
> And this is how you become a contributor to high-profile projects like Docker, Akka, Spark etc, and now free to boast about it in your CV!
And get the same pay as everyone else that doesn't play any of these reindeer games, because recruiters already know how to find us and set up the interviews!
Of these, 1: Choose the project you want to contribute to (and 1.5: Choose the issue to work on) and 9: Follow up are the hard ones. Both are primarily social problems.
For 1, it's mostly about knowing yourself. What projects interest you, and where can you contribute?
For 9, it's convincing the owners that your contribution is a net positive. Start with 2: Check out how to contribute, and proactively reach out so your pull request doesn't come out of the blue.
Oh, and be willing to put your ego aside -- it can be tough to defend your work, particularly if you're a new contributor (and thus haven't built up trust). It gets easier, both as the project learns to trust you and as you learn to work within their practices.
So actually, I was thinking about #1 the other day as well. When you go to GitHub.com, they now have /explore and /showcases. But, even if you find something interesting (say https://github.com/showcases/open-journalism), it isn't clear that any of those projects are suitable for contribution. Not everyone uses a CONTRIBUTING.md, but even more so, I think many of the showcased projects on GitHub fall into the "free to make a copy of" category rather than "open to contribution".
So I keep coming back to what has been said elsewhere: the best way seems to be to find a bug in something you use, realize it's open source and go from there. That's unfortunately not a great way to mobilize the masses of people that could contribute, but don't have a particular project in mind (think GSoC).
> The way people (usually) contribute to an open source project on GitHub is using pull requests.
I disagree with this premise. The way people usually contribute to an open source project on GitHub is creating an issue or adding to a discussion. IMHO this is more valuable than actually writing code because it helps other developers gauge the relative demand for a feature/bugfix and sometimes you find out that other people have already solved the problem in their own forks.
I've been doing this without thinking about it for a while, and after reading it from a beginner's perspective it seems like quite a few steps. It is the "right" thing to do though, as far as I know.
Why did this get so many upvotes? It's well written, but isn't it just a trivial guide on how to do a pull request? Not trying to be controversial, I'm just genuinely curious.
I would add "Make sure the maintainer of the repo will merge (or even look at) your PRs". More than once I have had very reasonable PRs (bugfixes) wait forever to be merged.
It’s amazing that there aren’t many articles like this one. I wrote something very similar [1] last year because I simply couldn’t find a complete, step-by-step guide that addressed the details of forking, branching, etc.
Is there a breakdown of top projects by language? I'm a C# developer and would love to put my skills to work on an open source project, but how do I find one?
[+] [-] jhchen|9 years ago|reply
[+] [-] fapjacks|9 years ago|reply
[+] [-] no_protocol|9 years ago|reply
[+] [-] sdrinf|9 years ago|reply
[+] [-] notyourwork|9 years ago|reply
[+] [-] csl|9 years ago|reply
[+] [-] danso|9 years ago|reply
[+] [-] the8472|9 years ago|reply
Personally I would take any formatting and just run an auto-formatter over the code section when I work on it the next time in case it bothers me.
Correct and sane code is far more important than hassling someone else to conform to a particular style.
In my opinion applying styles is a task for machines, not humans.
[+] [-] hzoo|9 years ago|reply
In Babel we use ESLint for this https://github.com/babel/babel/blob/master/Makefile#L20-L27
[+] [-] sctblol|9 years ago|reply
[+] [-] brobinson|9 years ago|reply
If I'm pushing a feature branch to my own remote:
If I need to check out someone's PR, it's:
[+] [-] rcfox|9 years ago|reply
http://stackoverflow.com/a/30584951
[+] [-] brobinson|9 years ago|reply
[+] [-] cynicaldevil|9 years ago|reply
[+] [-] dom0|9 years ago|reply
I grew so tired of that that I just wrote a little wrapper script, that just does the right thing (tm). So it's
for me.
[+] [-] mholt|9 years ago|reply
[+] [-] aban|9 years ago|reply
[0]: https://github.com/guyzmo/git-repo
[1]: https://news.ycombinator.com/item?id=12677870
[+] [-] hzoo|9 years ago|reply
Contributing can be a lot more than just PRs though:
- answering questions on stack overflow, chat (irc, gitter, slack)
- creating a minimal code repro, checking for duplicates, checking if a bug is fixed in a later release/master branch
- writing tutorials/usage scenarios, giving talks, just using the project and providing feedback
- helping with documentation + website
- translations if possible
- reviewing other PRs
- helping with the changelog, testing prereleases
- adding to the discussion on issues
Bigger projects can have a pretty hard time with maintenance: fixing bugs, juggling PRs, making releases, answering questions, etc.
(We're looking for help on https://github.com/babel/babel and trying to figure out how we can make the project more contributor friendly!)
[+] [-] Mz|9 years ago|reply
I am not a programmer. I do copywriting for pay and I have run a bunch of different personal websites over the years (15+ years, I think). I know a little HTML and CSS.
I am interested in getting involved in open source via first working on copywriting and website stuff, since that is what I already have a background in. I am finding it extremely opaque to figure out how on earth to do that.
I left a couple of related comments on HN recently:
https://news.ycombinator.com/item?id=12860294
https://news.ycombinator.com/item?id=12851361
I do have a github account and I recently went through the Hello World on how to do pull requests.
Any tips on how to begin interacting with open source projects on the copywriting and website development piece of things?
Thanks.
[+] [-] cthulhuology|9 years ago|reply
[+] [-] kjksf|9 years ago|reply
[+] [-] mugsie|9 years ago|reply
[+] [-] roblabla|9 years ago|reply
[+] [-] atemerev|9 years ago|reply
[+] [-] gosubpl|9 years ago|reply
Please find an issue marked as community in https://github.com/akka/akka/issues or https://github.com/akka/akka-http/issues and give it a go. How to solve the "credentialize yourself" problem? Be known to the committers by first working on a docs issue.
[+] [-] logn|9 years ago|reply
[+] [-] agibsonccc|9 years ago|reply
It is the responsibility of the developers of the project to be responsive though.
I would just like to add that many open source projects you are looking at are funded by companies with salaried developers (all of the ones you listed are). These companies have customers that come first. There has to be due diligence on what goes into the code base.
If you are ready to submit a pull request to a project and the developers are receptive to teaching, then view it as a learning experience. Try to understand both sides though.
[+] [-] cyphar|9 years ago|reply
[+] [-] cloudjacker|9 years ago|reply
[+] [-] majelix|9 years ago|reply
[+] [-] boulos|9 years ago|reply
[+] [-] TAForObvReasons|9 years ago|reply
[+] [-] chmike|9 years ago|reply
[+] [-] Whackbat|9 years ago|reply
Additionally, the best way to learn git is to use it so try all the examples.
[+] [-] fphilipe|9 years ago|reply
[+] [-] dibanez|9 years ago|reply
[+] [-] winterismute|9 years ago|reply
[+] [-] tbarbugli|9 years ago|reply
[+] [-] infodroid|9 years ago|reply
[1] https://guides.github.com/activities/contributing-to-open-so...
[2] https://guides.github.com/activities/forking/
[+] [-] matiasz|9 years ago|reply
[1] http://www.matiasz.com/2015/04/16/contribute-open-source-rep...
[+] [-] exhilaration|9 years ago|reply
[+] [-] hzoo|9 years ago|reply
It'd be much easier to find a project you use or know about so you have a lot more context into the project, its usage, documentation, etc.
[+] [-] tintoy|9 years ago|reply
http://up-for-grabs.net/#/tags/C%23