There are serious problems with git submodules. This article, however, concentrates on "you can forget to do X or not understand Y, in which case you can cause yourself minor irritations", which is just silly. If you understand how to use submodules, all of the problems in this article go away and get replaced with more serious issues: "the submodule update mechanism doesn't get rid of obsolete submodules"; "submodules can only exist in the root folder"; "the mechanism for migrating between different upstream sources of a submodule (which will happen: this is a distributed version control system) requires coordination with people using the code"; and "for many people, who are attempting to use this in the context of a unified company, the lack of a solution for moving code back and forth between submodules makes moving to git a major step down from Subversion, where the subtree repository support made it entirely reasonable to store an entire organization's worth of projects, with binary art assets, together in a single master repository".
> a major step down from Subversion, where the subtree
> repository support
You could always check out git-subtree. The functionality is there under the hood in git (git-subtree is literally a shell script, though not a minor one).
I've said this before but I'll say it again. Stay the hell away from Google's Repo tool. It's a half-baked, badly maintained piece of ad-hoc software. It will completely destroy your git workflow. You'll also be married to the crappy review tool called Gerrit.
Repo was made prior to Git submodules to do the exact same thing for Android. Now that Git has submodule support, Repo is useless. It does pretty much the same thing as submodules, but it does it in a very crappy way.
For example, you cannot go back to a specific set of versions of subdirectories with Repo. In other words, there's no "global" git bisect as there would be with submodules. This stops you from automatically finding a problem in one of the submodule repos if there are dependencies between the problematic repo and other repositories.
"...the reason we made repo was because we didn't want to deal with commits in the super project, or trying to merge concurrent branches in the super project. Instead we wanted each subproject to use a floating branch as the revision it is tracking."
Git's submodule feature definitely has rough edges; however, I think the benefits outweigh the cons.
One of the best benefits I see is that submodules make embedding forks more manageable. For example, when you include an open source library in your own project, it's common that you'd want to modify the library in some way. If you commit the modified library into your own project's repository, you'll have a harder time absorbing bug fixes/features from upstream later on. Instead of a simple merge, you'd have to check out the updated library somewhere else and use a diff tool to compare the changes. And, if the library changed much, you may need to find the specific commit that your modifications were based on so you can understand how to rebase them.
In addition, submodules make it easier to: contribute bug fixes/patches, share a modified open source library across different projects, and identify bugs introduced in updated submodules (since the history is preserved).
If you want to make changes to the upstream, I suggest adding an extra mirror in between the upstream and the repo you want to add the submodule in. That way, you can maintain your version without worrying about how your other code is affected. Only after it is tested do you then merge it into your other repo.
So "if you forgot to run git submodule update, you’ve just reverted any submodule commits the branch you merged in might have made"... Yes, if you forget to type the correct commands then undesired effects will happen. Now that's a git design flaw? Give me a break.
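For anyone who wants to see the behaviour being argued about, here is a throwaway sketch. Everything in it is an assumption made for illustration: the temporary directories, the file names, and the `g` wrapper, which only supplies a commit identity, allows local-path submodule URLs, and switches off automatic submodule recursion so the default behaviour is visible.

```shell
set -e
demo=$(mktemp -d)
# Scaffolding (assumption): identity, local-path submodule URLs, no auto-recursion.
g() { git -c user.name=Demo -c user.email=demo@example.com \
          -c protocol.file.allow=always -c submodule.recurse=false "$@"; }

# A library repository with one commit.
g init -q "$demo/lib"
cd "$demo/lib"
echo v1 > file.txt
g add file.txt
g commit -qm "lib v1"
v1=$(git rev-parse HEAD)

# A superproject that pins the library at v1.
g init -q "$demo/super"
cd "$demo/super"
g submodule add -q "$demo/lib" lib
g commit -qm "add lib submodule"

# Upstream publishes v2.
echo v2 > "$demo/lib/file.txt"
g -C "$demo/lib" commit -qam "lib v2"
v2=$(git -C "$demo/lib" rev-parse HEAD)

# A branch in the superproject bumps the pinned commit to v2.
g checkout -q -b bump
g -C lib pull -q --ff-only
g commit -qam "bump lib to v2"

# Back on the original branch, with lib/ checked out at v1 again.
g checkout -q -
g submodule update -q

# Merge the bump: the recorded commit moves to v2, the checkout in lib/ does not.
g merge -q bump
before=$(git -C lib rev-parse HEAD)
git submodule status    # a leading "+" marks the mismatch

# The step that is easy to forget.
g submodule update -q
after=$(git -C lib rev-parse HEAD)
```

Between the merge and the final `git submodule update`, a `git commit -a` in the superproject would record the old library commit again, which is the accidental revert the article complains about.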
Having a controlled state for foreign repos is absolutely essential.
Having worked with all-in-one repos, where external stuff is thrown in and then rots, I find submodules a better way of keeping external code up to date: simple yet controlled.
Yeah, I agree, submodules prevent the copy and paste rot that can happen when you copy library code directly into your repo. We've been using submodules for most of our projects, and I don't think they are that bad as long as you are just a little extra careful with managing them.
> As it stands submodules don't even work sanely for the most trivial use-case of tracking a (slow changing) vendor-repo.
That is not the use case it was designed for. And submodules still work just fine for that. The fact that you have to do a separate submodule update does not make it broken. In fact, it makes it better when you start thinking about all the corner cases there are.
If you truly want to build software out of subrepositories, you need a sane way of tracking working sets of revisions across all the submodules. And that is what git submodules do.
If you were blindly tracking the head of a repository, it would be useful _only_ for the use case you mentioned. Tracking slowly changing external vendor repos. That would be a whole lot less useful.
I think git subtree is the most sane solution here: the ability to split updates to the subtree back out to upstream makes it just as useful as submodules, without all the pain.
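A rough sketch of the subtree workflow being recommended, run against throwaway repositories. It assumes `git subtree` is installed (it ships in git's contrib directory and is not present in every package); the directory names, the `vendor/lib` prefix and the `g` identity wrapper are made up for the example.

```shell
set -e
demo=$(mktemp -d)
# Scaffolding (assumption): supply a commit identity without touching global config.
g() { git -c user.name=Demo -c user.email=demo@example.com "$@"; }

# Hypothetical upstream library.
g init -q "$demo/lib"
cd "$demo/lib"
echo v1 > file.txt
g add file.txt
g commit -qm "lib v1"
branch=$(git symbolic-ref --short HEAD)

# The project that will vendor the library.
g init -q "$demo/super"
cd "$demo/super"
echo app > app.txt
g add app.txt
g commit -qm "app"

# Import the library under vendor/lib as ordinary tracked files.
g subtree add --prefix=vendor/lib "$demo/lib" "$branch" --squash

# Upstream moves on; pull the change into the vendored copy.
echo v2 > "$demo/lib/file.txt"
g -C "$demo/lib" commit -qam "lib v2"
g subtree pull --prefix=vendor/lib "$demo/lib" "$branch" --squash -m "Update vendor/lib"
vendored=$(cat vendor/lib/file.txt)
```

Changes made under `vendor/lib` can be sent back the other way with `git subtree push --prefix=vendor/lib <repository> <branch>`, which is the "split back upstream" part. A plain clone of the project gets the library files with no extra step.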
Submodules are pretty annoying if you modify the submodule repo a lot, but a pre-commit hook would help against the "forgot to push in submodule" issue, no?
Or a simple git or shell alias which would do the submodule update when you want it. That would solve most of the sources of complaints in the article and in this thread.
However, most likely you don't want an automatic submodule update because of all the issues there are. It would only be useful for tracking a very stable slow moving external dependency.
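The alias idea is easy to sketch. The alias names (`sup`, `pullall`) are invented here, and the config is written to a scratch repository so nothing global changes. For the "forgot to push in submodule" case, this sketch uses git's `push.recurseSubmodules` setting instead of the pre-commit hook suggested above, since that check runs at push time.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"

# "git sup": bring every submodule to the commit the superproject records.
git config alias.sup 'submodule update --init --recursive'
# "git pullall": pull, then sync submodules, as one explicit command.
git config alias.pullall '!git pull && git submodule update --init --recursive'

# Refuse to push the superproject while it references submodule
# commits that are not available on any of the submodule's remotes.
git config push.recurseSubmodules check

sup_alias=$(git config --get alias.sup)
push_guard=$(git config --get push.recurseSubmodules)
```

The update still only happens when you ask for it, which is the point made above: it costs one short command and nothing runs automatically.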
Agree with most of the posters, that Git submodules are very hard to work with.
The key to submodules is that you should not update them on a regular basis. A good example is the Gitflow project that uses the shFlags repo as a submodule.
A small gotcha is that you need to use --recursive when cloning the repo, so that you get the submodule cloned as well.
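To make the gotcha concrete, here is a small sketch with throwaway local repositories standing in for the real project. The paths, file names and the `g` wrapper (commit identity plus permission to clone submodules from local paths) are assumptions for the demo only.

```shell
set -e
demo=$(mktemp -d)
# Scaffolding (assumption): identity and local-path submodule URLs.
g() { git -c user.name=Demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

# Hypothetical library and a superproject that embeds it as a submodule.
g init -q "$demo/lib"
cd "$demo/lib"
echo "library code" > file.txt
g add file.txt
g commit -qm "lib v1"

g init -q "$demo/super"
cd "$demo/super"
g submodule add -q "$demo/lib" lib
g commit -qm "add lib submodule"

cd "$demo"
# Plain clone: lib/ exists but is empty.
g clone -q super plain
if [ -e plain/lib/file.txt ]; then plain_has_file=yes; else plain_has_file=no; fi

# Recursive clone: submodules are cloned and checked out too.
g clone -q --recursive super full
full_content=$(cat full/lib/file.txt)

# Repairing the plain clone after the fact.
g -C plain submodule update -q --init --recursive
fixed_content=$(cat plain/lib/file.txt)
```

So forgetting the flag is recoverable: `git submodule update --init --recursive` inside the existing clone fetches what the recursive clone would have.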
I've also tried and failed several times to use submodules. Life would be a lot easier for my situations if the parent could always point to the head of the child instead of a specific revision.
No, it would not be a good idea for submodules to always point to the HEAD of their repos. It might make sense for syncing with some very stable external projects, but that is a very limited use case.
I need to have a consistent set of all the submodules I'm working with. I need to reliably get the exact same versions of all the modules in the big repository. This allows me to do a "git bisect" to search for problems in submodules.
Doing "git submodule update" is not as bad as the OP suggests it is. It makes perfect sense to have it as it is.
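Here is a sketch of what bisecting across a pinned submodule can look like, again with throwaway repositories. The repository layout, the `status.txt` pass/fail marker and the `g` wrapper are all invented for the demo; a real project would run its own test command in place of the `grep`.

```shell
set -e
demo=$(mktemp -d)
# Scaffolding (assumption): identity, local-path submodule URLs, no auto-recursion.
g() { git -c user.name=Demo -c user.email=demo@example.com \
          -c protocol.file.allow=always -c submodule.recurse=false "$@"; }

# Library: the first commit works.
g init -q "$demo/lib"
cd "$demo/lib"
echo pass > status.txt
g add status.txt
g commit -qm "lib: working"

# Superproject pins the working library (known-good starting point).
g init -q "$demo/super"
cd "$demo/super"
g submodule add -q "$demo/lib" lib
g commit -qm "add lib submodule"
good=$(git rev-parse HEAD)

echo 1 > app.txt
g add app.txt
g commit -qm "app change 1"

# The library regresses, and the superproject picks the regression up.
echo fail > "$demo/lib/status.txt"
g -C "$demo/lib" commit -qam "lib: regression"
g -C lib pull -q --ff-only
g commit -qam "bump lib"
culprit=$(git rev-parse HEAD)

echo 2 > app.txt
g commit -qam "app change 2"

# Bisect the superproject; sync the submodule before every test run.
g bisect start HEAD "$good"
g bisect run sh -c 'git submodule update -q --init && grep -q pass lib/status.txt'
found=$(git rev-parse refs/bisect/bad)
g bisect reset
g submodule update -q
```

Because every superproject commit records an exact library commit, the bisect lands on the "bump lib" commit. With a floating HEAD there would be no such commit to find.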
Absolutely. I certainly understand why it works the way it does. Especially when dealing with 3rd party modules, it's good to be explicit about which version of a submodule should be included. But having a setting that allows me to keep a submodule at HEAD at all times would certainly save me quite a few keystrokes for my own internal libraries, most of which I quite regularly find myself crawling up the tree to add, commit, and push.
The problem with all wrappers and extensions is that you're no longer using vanilla Git, which probably breaks other tools like GUI clients. Plus the wrappers and extensions come with their own caveats, so it's still perfectly possible to get the repo into some broken state, only this time your scenario is even more esoteric. I so wish submodules got better treatment in vanilla Git.
saurik | 14 years ago
ryanpetrich | 14 years ago
pyre | 14 years ago
exDM69 | 14 years ago
valley_guy_12 | 14 years ago
https://groups.google.com/d/msg/repo-discuss/ZpqOOE5mLXo/Sw0...
mindjiver | 14 years ago
buddydvd | 14 years ago
alexchamberlain | 14 years ago
aoprisan | 14 years ago
brazzy | 14 years ago
Even if you think you're perfect and never make mistakes, it's not a good idea to add easy opportunities to make mistakes.
bryanlarsen | 14 years ago
http://git.661346.n2.nabble.com/git-subtree-Next-Round-Ready...
This is excellent news. I've been using git subtree for a couple of years now without incident, and highly recommend it.
harshreality | 14 years ago
http://thread.gmane.org/gmane.comp.version-control.git/19648...
http://thread.gmane.org/gmane.comp.version-control.git/19560...
bryanlarsen | 14 years ago
aiiane | 14 years ago
irrationalidiom | 14 years ago
andywhite37 | 14 years ago
moe | 14 years ago
As it stands submodules don't even work sanely for the most trivial use-case of tracking a (slow changing) vendor-repo.
Whoever designed this (Linus?) had a real brainfart here.
exDM69 | 14 years ago
harshreality | 14 years ago
http://blog.codekills.net/2011/07/14/nested-repository-handl...
etherealG | 14 years ago
jacobr | 14 years ago
exDM69 | 14 years ago
sharken | 14 years ago
rogerbinns | 14 years ago
exDM69 | 14 years ago
enobrev | 14 years ago
cpt1138 | 14 years ago
zoul | 14 years ago