Don't include configs in your Git repos

[+] pragmar|10 years ago|reply

It's good to understand the trade-offs and not blindly add binaries and configs, but source control conventions should boil down to what works best for you and your team. For configs, I've found a reasonable solution is to include a base config and override it with a repo-ignored local config containing system/db-specific values that do not belong in source control.
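A minimal sketch of that base-plus-local-override pattern, assuming JSON config files. The file names `config.base.json` (committed) and `config.local.json` (listed in .gitignore) and the helper names are made up for illustration:

```python
import json
from pathlib import Path


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`.

    Nested dicts are merged recursively; any other value is replaced outright.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(directory: Path) -> dict:
    """Load the committed base config and overlay the git-ignored local one, if present."""
    base = json.loads((directory / "config.base.json").read_text())
    local_path = directory / "config.local.json"  # hypothetical name, kept out of the repo
    if local_path.exists():
        return deep_merge(base, json.loads(local_path.read_text()))
    return base
```

With this layout a fresh clone still starts with the committed defaults, and each machine only has to supply the handful of values that differ.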

[+] darylteo|10 years ago|reply

'Don't include "sensitive" configs in your Git repos' should be the correct title.

With regards to external dependencies, Vendoring is a source of debate and I don't think the larger community agrees on one approach over the other. More recently, Golang recommends Vendoring and, last I heard, refuses to support dependency management, so if anything you should look at their arguments for doing so.

https://www.reddit.com/r/golang/comments/2xpulb/the_go_team_...

I do think this is important to absorb though:

Point 1 - a build must always be reproducible and deterministic. Point 2 - a build that pulls in external dependencies relies on its external dependencies to be reproducible and deterministic.

When Point 2 is not 100% certain (reliability, changeable artifacts), Point 1 goes out the window. Therefore, Vendoring is a crucial strategy from a build engineering perspective.
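One way to at least detect when Point 2 fails in a non-vendored build is to pin a checksum for each downloaded artifact and refuse to build on a mismatch. This is only a sketch of that idea, not something from the article; the function names are hypothetical and the pinned hash would live in your repo next to the dependency list:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the downloaded artifact differs from the pinned hash."""
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"{path.name}: expected sha256 {expected_sha256}, got {actual}"
        )
```

This catches a changed artifact, but it does nothing for one that has disappeared upstream, which is where Vendoring or your own proxy still comes in.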

The most common strategy to mitigate this in a non-vendored build is to have control over the external dependencies by deploying your own dependency platform for your organisation. The latest Artifactory, for example, has proxy support for many mainstream artifact repositories (gems, eggs, dockers, images, modules, bowers, etc.).

Now... this project I inherited has had modifications made to the vendored dependencies... smashes desk with head

[+] scintill76|10 years ago|reply

> Do not include your compiled files or binaries in a git repository! Binaries (or executable files) will almost always be operating-system-dependent. This will cause major headaches if you have some developers using Macs, some using Windows, and some using Linux

I think you're doing something wrong if this is an issue. Unless you run on Windows, Mac, and Linux in production, your developers should not be locally running disparate platforms, they should be running whatever the production server does. There may be other reasons not to include binary programs, but this isn't really it.

> Furthermore, there are certainly good reasons (read: file-insurance) to commit these kinds of files.

What he calls "insurance", is basically the whole point of using revision control to me. I want everything linked together in a versioned dependency tree that's stored and manipulated with one interface. Whether it's code, text, or images, if it's logically part of the project, then it all goes together. If git can't do that for you, it's not the right tool for the job.

> Do not include downloaded dependencies in a git repository!

Works great if Github or whoever is around for as long as you want to have a complete history, and the project owner doesn't push out new incompatible code under the same revision, and the project owner doesn't delete "that old version nobody uses anymore"...

[+] SOLAR_FIELDS|10 years ago|reply

There are some use cases to include compiled files in Git. As a practical example, I run a project that is mainly Java at my place of work, but has a Node.js dependency that will likely never change. We aren't sure that wherever we deploy internally will have Node installed, so we use nexe to compile into a UNIX executable that the Java then consumes. It's unreasonable to ask our user base to wait for nexe to compile the entire node library (which takes around 10 mins if you've never dealt with nexe) every time they want to make a small change to the Java classes, so we bundle the compiled node to drastically cut down on build times with Gradle. Granted this is a pretty edge case but there are definitely reasons you would want to completely disregard this article's advice on that front.

[+] manicdee|10 years ago|reply

That's what submodules help you with. Locally stored version of the dependency, but it's not in your main project repository, and you get to send pull requests for bug fixes easily.

[+] elchief|10 years ago|reply

Put your passwords in env vars, people. By that I mean your web server-to-db connection or 3rd party API passwords. See 12factor.net.

Then you can still store your config files in git.
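Roughly what that looks like on the application side, as a sketch. The variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_NAME`) and the PostgreSQL-style URL are assumptions for illustration, not anything 12factor prescribes:

```python
import os
from urllib.parse import quote


def database_url(env=os.environ) -> str:
    """Build a connection URL from environment variables instead of a committed file.

    DB_USER and DB_PASSWORD are required; DB_HOST and DB_NAME fall back to defaults.
    """
    try:
        user = env["DB_USER"]
        password = env["DB_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"missing required environment variable {missing}") from None
    host = env.get("DB_HOST", "localhost")
    name = env.get("DB_NAME", "app")
    # Percent-encode the credentials so characters like '@' or '/' cannot break the URL.
    return f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}/{name}"
```

The non-secret parts (pool sizes, feature defaults and so on) can stay in a committed config file; only the credentials come from the environment.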

[+] jsmeaton|10 years ago|reply

I still haven't seen or found a good way to deploy secrets in environment variables. My current practise is to keep secrets in a file on the build server, and the build server is responsible for positioning it correctly.

[+] EvanPlaice|10 years ago|reply

This is clearly outlined in 'The Twelve-Factor App'.

http://12factor.net/config

System-dependent configuration should be set via ENV variables to prevent the following: one, it prevents sensitive credentials from mistakenly being added to source control; two, it simplifies deployments and prevents mistakes (ex pushing a debug configuration to production) from exposing the underlying structure of your applications.

You should never have to set a --debug flag or parameter via a config file. Rather, your development machine itself should have the environment configured to enable debugging.

There really is no downside to this approach. Environmental Variables are trivial to configure and once they're set they usually don't need to change.
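For the debug case, a sketch of what "the environment enables debugging" could mean in code. The variable name `APP_DEBUG` and the accepted truthy strings are my own assumptions:

```python
import os

# Strings treated as "on"; anything else, including an unset variable, means off.
TRUTHY = {"1", "true", "yes", "on"}


def debug_enabled(env=os.environ) -> bool:
    """Return True only when APP_DEBUG is explicitly set to a truthy value."""
    return env.get("APP_DEBUG", "").strip().lower() in TRUTHY
```

Because the default is off, a production box that never sets the variable cannot end up in debug mode by accident.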

[+] Stasis5001|10 years ago|reply

Dumb question. When, say, setting up a new server, you need to set up the environment variables to have your production credentials. How do I automate this process if the variables aren't stored in a central repo? Keeping a static file on my computer could maybe cut it, but won't help me get new team members able to deploy, unless I email them that file or something. Is having a file stored in email or dropbox or something really way better than having it stored in a private git repo? I assume there's a clever way to do this that I'm missing?

[+] deathanatos|10 years ago|reply

My one nit is that I take the stance that .gitignore is only for ignoring things that are "of the nature" of the project. A developer's personal tooling preference that might not be shared by others is not "of the nature of the project" — the project itself doesn't cause a developer to have an affinity for a particular tool; these belong in an exclude file, usually the user's global excludes file[1].

.DS_Store is an example of this: this is a file caused by the tools you're using to develop with, which are likely particular to you. My system does not generate these files. For the same reason, I wouldn't include vim swap files[2]. A ".pyc" ignore in a Python project, or a .o ignore in a C project, however, is correct, because the nature (in this case, the language in use) of the project causes that ignore to be needed. Ask yourself who needs the file ignored: The project (.gitignore), or you (exclude)?

The big benefit is that you need only exclude (or ignore) the file once: .DS_Store is likely not only invalid in one repo, but in all repos. So exclude it globally. Also, the rest of us aren't wondering "what's a .DS_Store?" `man gitignore` has a decent passage about when to use which.

[1]: See `man gitignore`, and `man git-config`'s `core.excludesFile`.

[2]: ".⋆.sw⋆", which is all too often ".⋆.swp" or "⋆.swp" (note: unicode magic here to avoid special meaning of "⋆"…; also, HN strips out U+2736-7's?)

[+] codemac|10 years ago|reply

Totally disagree about dependency management. If you check them into your repo, people who use your repo to build do not need to fetch other dependencies which may or may not exist over the network anymore. This is especially true if you ever intend to build an old version of your software.

And if your library doesn't support multiple platforms, I'm not sure I fully understand how omitting the source code of a dependency from your repo magically gives you platform independence.

[+] EvanPlaice|10 years ago|reply

1. They pollute the source history with code that doesn't relate to the project. Every time I see a repo for a trivial application that has 100K+ LOC I think, "wow... this person clearly doesn't understand how to use package management tools". Dependencies should be clearly outlined in the config and easily set up using a package manager. You can either download the deps' source from your repo or you can download it from a package manager; either way you'll need to download it.

2. If your dist requires a compilation stage, it will likely generate binaries targeted specifically to the architecture they're compiled on. Automating the compilation stage enables users to compile the source to work on their specific architecture (ie hence platform independence).

3. Dependencies should be frequently updated and tested against the latest. Not doing so potentially exposes you to bugs and/or security risks that may be included in future updates. If your app has relatively good test coverage there's no reason you shouldn't be using the latest minor version for all of your dependencies.

Not doing so is assuming responsibility for the consequences of running stale code.

[+] dnbdnbdnb|10 years ago|reply

With regards to bower components, it's typically a good idea to leave the folder in your repo: http://stackoverflow.com/questions/22327758/should-bower-com...

[+] EvanPlaice|10 years ago|reply

Just another example of why Bower is on the path to go the way of the dodo.

When ES6 rolls out with System.js (i.e. polyfills exist now), JavaScript will have native support for modules.

In addition, JSPM will replace Bower as the tool-of-choice for managing client-side javascript dependencies.

Bower was a good 'stepping stone' for an ecosystem of broken module management. Fortunately, module support is improving.

[+] wavefunction|10 years ago|reply

I would rather just leave bower.json in the repo. What's the point of bower if you're including the /bower_components folder?

edit: before someone snarks about rtfa, I did. I disagree. What is the point of bower if you're checking in components?

[+] phillijw|10 years ago|reply

I don't fully agree with the config thing. In microsoft land you can transform your configs based on a build parameter. And if you're using MS SQL you don't even need to use a password login, you give the service account the permissions to access the database. It's pretty harmless and fairly standard.

[+] jakejake|10 years ago|reply

The configs for .net I would definitely consider part of the code - they can have some really intense setup for the project!

If I was forced to put credentials in there, then I would ignore web.config, but commit a web.config.default file without the credentials. Part of the setup instructions would include copying the file after cloning the repo and putting in your own machine-specific credentials.
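That copy step is easy to script so nobody forgets it. A sketch, assuming the two file names above; the helper name is hypothetical:

```python
import shutil
from pathlib import Path


def ensure_local_config(project_dir: Path,
                        template: str = "web.config.default",
                        target: str = "web.config") -> bool:
    """Copy the committed template to the git-ignored local file if it is missing.

    Returns True if a copy was made, False if a local file already existed
    (an existing file is never overwritten, so local credentials are safe).
    """
    destination = project_dir / target
    if destination.exists():
        return False
    shutil.copyfile(project_dir / template, destination)
    return True
```

Run it once after cloning, then fill in your machine-specific credentials in the copied file.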

[+] drakenot|10 years ago|reply

Where do you keep your configs then?

For some of my projects I maintain a separate Ansible playbook repo and most of my configs/templates in it.

I also check in my Pods directory for my iOS projects. I know there is debate around this but it has been the least painful option for me working on a team.

[+] wtbob|10 years ago|reply

> Where do you keep your configs then?

In a different git repo.

In my mind, the developed code and the deployed environment are, or should be, quite separate. I've been pretty successful with this in the past, and when I've been on teams which violate it…suffering has happened.

[+] dvanduzer|10 years ago|reply

When you're running a SaaS business, your configs are your code in many cases.

Version control is incredibly important for scaling operations. To keep things manageable at scale, you also need to separate concerns properly. But this just means you shouldn't do things like have your configuration API refer to its own repository to configure itself.

edit: CocoaPods in particular is a weird artifact that I suspect will go away now that iOS is supporting dynamically linked frameworks.

[+] EvanPlaice|10 years ago|reply

Application config != environment config.

Configuration for setting up your application will be checked in as usual. Configuration to switch modes (ex testing/production) as well as config for sensitive information (ex auth credentials) should never be checked in.

As an example, there was a news story that popped up lately of a guy who accidentally published his AWS private key to a GitHub repo and ended up being charged for thousands of dollars of usage when somebody saw it and decided to use it for malicious means.

Same goes for testing/production. There's no excuse for displaying debug info to users. You're basically giving hackers a fast lane to fuzzing and exploiting your server.

[+] seivan|10 years ago|reply

There is some massive pain in having Pods checked in at the beginning of a project, but once you get more stable it's alright.

[+] jamiewildehk|10 years ago|reply

Ahh, but do you then commit your .gitignore file?

[+] twblalock|10 years ago|reply

I always do. My company has a canonical .gitignore that we update when we see things in git repos that should not be there. It now contains all of the stupid hidden project directories for all of the major IDEs, build artifacts we don't want, and a bunch of other stuff.

I highly recommend this approach.

[+] deathanatos|10 years ago|reply

Absolutely. .gitignore is a project-level declaration that "these files do not belong in source control". Keeping that in source control not only makes sure everyone gets the same copy, but also allows changes to be annotated by relevant commit messages.

[+] dvanduzer|10 years ago|reply

Subtle.

[+] krapp|10 years ago|reply

yes?

[+] draw_down|10 years ago|reply

There are good points here, but it's possible to take a more sophisticated approach to these things. You don't have to leave configs out entirely just to avoid putting passwords in the repo. Just don't put passwords in the repo. Also, checking in CSS files compiled from Sass (for example) can be a pragmatic way around the dilemma of having to deploy a separate Ruby stack, or have a more complicated build process (small projects only). I've actually never worked in a place that didn't check in images. Where would they go?

[+] BorisMelnik|10 years ago|reply

One of the biggest offenses I've seen lately in this regard is developers who keep their database name, username and passwords in their CMS config (wp-config.php is a big one).

[+] cleaver|10 years ago|reply

WordPress seems to be well behind in good dev practices. I do know Drupal adds the settings.php to the default .gitignore. It would take a bit more than simple ignorance to add settings to the repo.

[+] twblalock|10 years ago|reply

Corollary: include an example config file somewhere, so new developers can figure out how the heck to get the app to start up properly.
