Git Subtree merged into mainline git

113 points | moonboots | 14 years ago | github.com

38 comments

[+] zbowling|14 years ago|reply
Subtree > Submodule in so many ways. Git submodule is a mess and now a legacy of git.

I've been using subtree in a fork of Git personally but can't (safely) use it at work with everyone else (we still use submodules there).

This is going to be a problem for github for reasons I put in my blog a few months back: http://zbowling.github.com/blog/2011/11/25/github/

Because GitHub has an explicit, top-down, a-fork-is-only-a-fork-by-clicking-the-fork-button kind of graph between projects, subtree won't work easily with its online tools for making pull requests and viewing the network graph of forks.

Basically, my repo is going to have the histories of 7 different projects combined in one repo, and from that single repo I will be pushing changes back to all 7 (and other versions of those 7). GitHub's pull request feature is going to have trouble with that concept because it makes the invalid assumption of a single upstream.

There are workarounds, for sure, like pushing your changes to a staging repo before finally doing a pull request upstream, but that is cumbersome. I'll probably write a shell script to automate it.
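
A minimal sketch of what such a staging script might look like, assuming git-subtree is installed and that its `push` subcommand is used to publish one prefix at a time. Every repository, path, and branch name below (lib, app, staging.git, vendor/lib, my-fix) and the `stage_subtree` helper are hypothetical, and the script builds throwaway repositories in a temp directory so it can be run as-is:

```shell
#!/bin/sh
# Sketch: publish the changes made under one subtree prefix to a staging
# repository. All names here (lib, app, staging.git, vendor/lib, my-fix)
# are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# stage_subtree PREFIX STAGING_REPO BRANCH
# Splits out the history of PREFIX and pushes it to BRANCH in STAGING_REPO.
stage_subtree() {
    git subtree push --prefix="$1" "$2" "$3"
}

work=$(mktemp -d)

# Stand-ins for one upstream project and its staging repository.
git init -q "$work/lib"
cd "$work/lib"
git symbolic-ref HEAD refs/heads/main
echo "v1" > lib.txt
git add lib.txt
git commit -q -m "lib: v1"
git init -q --bare "$work/staging.git"

# The combined repository, with the upstream imported as a subtree.
git init -q "$work/app"
cd "$work/app"
git symbolic-ref HEAD refs/heads/main
echo "app" > app.txt
git add app.txt
git commit -q -m "app: initial commit"
git subtree add --prefix=vendor/lib "$work/lib" main

# A local fix to the embedded project, committed in the combined repository.
echo "patched" > vendor/lib/lib.txt
git commit -q -am "lib: local fix"

# Publish only the vendor/lib history; a pull request could then be opened
# from the staging repository's my-fix branch.
stage_subtree vendor/lib "$work/staging.git" my-fix
```

With seven embedded projects, the same helper could be called once per prefix and staging repository pair.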

[+] jpeterson|14 years ago|reply
> Git submodule is a mess and now a legacy of git.

I hear this kind of thing all the time, but I don't understand why. I've never had any problem whatsoever with git submodule. Can you elaborate a bit on what you mean by this?

[+] avar|14 years ago|reply
It's been merged into the contrib/ directory, so it's not installed by default. Saying it's merged into the mainline has the connotation that it's available like any other tool, and it isn't.
[+] sophacles|14 years ago|reply
So this looks pretty awesome. I can see myself using it a lot. There is one workflow that I do with submodules that I don't see how to do with subtrees:

Clone a repo to the target dir. Add it as a submodule. Decide to play with a branch of the submodule, so switch to that branch in the cloned repo. Then, if I decide to work with that, update my submodules; otherwise, switch back.

Is that sort of workflow available in subtrees? How do I do it?

[+] antidoh|14 years ago|reply
I've never used git in-the-large, which may be why I don't understand:

Why must some other library/module be part of my project? Why not reference and maintain the external lib/mod externally? We've been doing that for decades. It seems a solved problem.

Bringing an external library within the fold of your project feels like unnecessary coupling.

[+] eropple|14 years ago|reply
Dependency control is a solved problem. Which is why SVN has externals, Hg has hgsubs, and Git has submodules and now subtrees. This isn't a git-in-the-large problem, it's an almost-any-nontrivial-project problem. It only seems unnecessary till you've had it and tried to live without it.

Bringing other libraries into my project is beneficial because then my build system is able to wrangle those just as easily as my code. (And it's also simply useful in the case where I've written both modules but maintain a separation for whatever reason--one is an open-source project and the other is not, whatever--to be able to make changes to one from within the other, run the tests for the submodule, and push it up to staging or upstream, without having to leave my current project.)

[+] nodemaker|14 years ago|reply
Yes exactly.

Once you have divided your project into independent modules, you have agreed that changes in these modules are going to have minimal impact on each other, so what exactly is the point of merging the history of all those changes? In my use case that would actually create a bigger mess.

Now, I think this could perhaps be useful if there are modules that I have forked from elsewhere and the fork is going to be used in my project only. Even then, though, I don't see any downside to using submodules.

The downside of submodules is the lack of good commands. For example, there is no single command to check out, in each of my submodules, the branch that this project uses. This could be done with git submodule foreach, but it's not trivial.
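
One way the foreach approach might look, as a sketch rather than a recommended recipe: it assumes the wanted branch is recorded per submodule under the `submodule.<name>.branch` key in .gitmodules, and it relies on the `$name` and `$toplevel` variables that `git submodule foreach` provides. The repositories (lib, app) and branch names (main, stable) are hypothetical and are created in a temp directory so the script is runnable on its own:

```shell
#!/bin/sh
# Sketch: switch every submodule to the branch recorded for it in .gitmodules.
# The repositories (lib, app) and branch names (main, stable) are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A stand-in library with two branches, main and stable.
git init -q "$work/lib"
cd "$work/lib"
git symbolic-ref HEAD refs/heads/main
echo "v1" > lib.txt
git add lib.txt
git commit -q -m "lib: v1"
git branch stable

# A project that uses the library as a submodule.
git init -q "$work/app"
cd "$work/app"
git symbolic-ref HEAD refs/heads/main
# protocol.file.allow is only needed because this demo clones from a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
# Record which branch this project wants for the submodule.
git config -f .gitmodules submodule.lib.branch stable
git add .gitmodules
git commit -q -m "app: add lib submodule, tracking stable"

# In each submodule, check out the recorded branch (falling back to main).
# $name and $toplevel are set by "git submodule foreach" itself.
git submodule --quiet foreach '
    branch=$(git config -f "$toplevel/.gitmodules" "submodule.$name.branch" || echo main)
    git checkout -q "$branch"
'
```

The fallback to main is an arbitrary choice for the demo; a real project might prefer to fail loudly when no branch is recorded.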

EDIT: In this thread zbowling makes a great argument against submodules.

There is a horrible habit of people forking projects on GitHub just so their submodules stay stable.

[+] Confusion|14 years ago|reply
For instance because you have componentized your product into different repositories, use different combinations of the components in different projects, but are still actively developing many of them simultaneously on each of those projects. Having to build, deploy and fetch gems (or whatever other method of distribution your language uses) is cumbersome if development is still really active.

I don't think you should ever want to bring unreliable external components, like a random github project that you aren't actively contributing to, into your tree like that.

[+] pyre|14 years ago|reply
One example would be:

  * https://github.com/altercation/solarized

  * https://github.com/altercation/vim-colors-solarized
The vim-colors-solarized repo is a subtree of the solarized repo. This is mostly a convenience for vim users that use something like pathogen + git-submodule to keep their plugins up-to-date. This way you can create a submodule @ ~/.vim/bundle/vim-colors-solarized and it would be the root of the bundle tree. If the vim colorscheme was only part of the larger repo, then users would be forced to create their own repos, or else do something like:

  git submodule add https://github.com/altercation/solarized .vim/bundle/.vim-colors-solarized
  ln -s .vim-colors-solarized/vim-colors-solarized .vim/bundle/
[+] adrianmsmith|14 years ago|reply
It's important to not only reference an external project (e.g. library) but also to reference a particular version of that library. (e.g. newer versions could remove deprecated methods which you are using, i.e. which weren't deprecated when you wrote your code.)

Different versions of your code could rely on different versions of the library (e.g. you update your code to a newer version of the library.) So which version of the library you rely on also needs to be version-controlled.

[+] lucian1900|14 years ago|reply
It makes it a little easier to maintain one or more forks of some library. Without submodules, you must have a separate repo for that, and there's no clear connection to the repo of your particular project.

This gets even more fun when you have a second project that needs an incompatible fork of the same library.

[+] parbo|14 years ago|reply
Maybe I'm stupid, but gitster/git doesn't sound like the mainline for git.
[+] gerrit|14 years ago|reply
It's the account of Junio C Hamano, who is the main maintainer of git.
[+] adig|14 years ago|reply
I've been trying out git subtree and I understand its advantages. But I've wondered if there's a way to use it without adding the merge commits to the timeline when I update the subtree repository. Something similar to what git rebase does. (Maybe this is a dumb question and I'm not using git subtree right) :)
[+] etherealG|14 years ago|reply
Honestly, I hope they actually do what the article title is suggesting and bring subtree into mainline. Congrats on making contrib, Avery.
[+] zoul|14 years ago|reply
Good explanations of subtree merge:

    http://progit.org/book/ch6-7.html
    http://help.github.com/subtree-merge/
[+] mkilling|14 years ago|reply
git-subtree is not the same as a subtree merge. The author himself likes to point that out. A subtree merge is a one-time operation. git-subtree, on the other hand, enables you to continue to merge in upstream changes. You can also split the subtree (including its history) from your repo and make it a standalone repository again.
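
To make the difference concrete, here is a rough sketch of that life cycle using the `add`, `pull`, and `split` subcommands. It assumes git-subtree is installed; the repositories (lib, app), the vendor/lib prefix, and the lib-only branch are hypothetical names, created in a temp directory so the script runs on its own:

```shell
#!/bin/sh
# Sketch of the git-subtree life cycle: add, pull upstream changes, split
# back out. Repository names and paths (lib, app, vendor/lib, lib-only)
# are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Never open an editor for the merge commit created by "subtree pull".
export GIT_MERGE_AUTOEDIT=no

work=$(mktemp -d)

# A stand-in upstream library with one commit.
git init -q "$work/lib"
cd "$work/lib"
git symbolic-ref HEAD refs/heads/main
echo "v1" > lib.txt
git add lib.txt
git commit -q -m "lib: v1"

# The project that will embed the library.
git init -q "$work/app"
cd "$work/app"
git symbolic-ref HEAD refs/heads/main
echo "app" > app.txt
git add app.txt
git commit -q -m "app: initial commit"

# 1. Import the library, with its history, under vendor/lib.
git subtree add --prefix=vendor/lib "$work/lib" main

# 2. Upstream moves on...
cd "$work/lib"
echo "v2" > lib.txt
git commit -q -am "lib: v2"

# ...and the embedding project merges the new upstream commits in.
cd "$work/app"
git subtree pull --prefix=vendor/lib "$work/lib" main

# 3. Extract the history of vendor/lib onto its own branch, which could be
#    pushed somewhere to become a standalone repository again.
git subtree split --prefix=vendor/lib -b lib-only
```

After step 3, the lib-only branch holds lib.txt at its root rather than under vendor/lib.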
[+] MBlume|14 years ago|reply
Reading the docs now, this looks like excellent work. Thanks to everyone involved =)
[+] ajray|14 years ago|reply
I hope to see Android's 'repo' tool improved to (optionally) make use of this.