Git from the Inside Out (2015)

[+] troughway|6 years ago|reply

>Git from the Inside Out

https://github.com/git/git

https://www.aosabook.org/en/git.html

https://codewords.recurse.com/issues/two/git-from-the-inside...

http://gitimmersion.com/

[+] neves|6 years ago|reply

Why are these links better than the original article?

[+] akkartik|6 years ago|reply

> Notice how just `git add`ing a file saves its content to the objects directory. Its content will still be safe inside Git if the user deletes data/letter.txt from the working copy.

Holy crap, how do I not know this in 14+ years of working with git?!

The `git add --help` manpage seems to make no reference to this feature; it just talks about adding the file to the index.
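
For anyone who wants to poke at the quoted claim, here is a rough Python sketch of what storing a file's content as a loose blob amounts to. It assumes the default SHA-1 object format and ignores packfiles and the index entry itself; `store_blob` and `read_blob` are names made up for illustration, not anything from Git.

```python
import hashlib
import zlib
from pathlib import Path


def store_blob(objects_dir: Path, content: bytes) -> str:
    """Write content as a loose blob object and return its hex object ID."""
    # Git hashes a small header plus the raw bytes, not the bytes alone.
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # Loose objects live at objects/<first 2 hex chars>/<remaining chars>.
    path = objects_dir / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return oid


def read_blob(objects_dir: Path, oid: str) -> bytes:
    """Read a loose blob back and strip the header."""
    store = zlib.decompress((objects_dir / oid[:2] / oid[2:]).read_bytes())
    header, _, content = store.partition(b"\0")
    assert header.startswith(b"blob ")
    return content


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        oid = store_blob(Path(tmp), b"")
        print(oid)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
```

Since the object ID depends only on the content, deleting the working copy file afterwards changes nothing in the objects directory.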

[+] coldpie|6 years ago|reply

It's not really a feature, more of a side-effect. Git-add causes git to record the state of added files. You can see this because if you make changes to an added (but uncommitted) file, you can see the diff between that uncommitted index and the state on disk. That index state must exist somewhere. Where it exists is in the object dir, just like everything else Git knows about.

(The article is slightly incorrect in that I think Git will eventually delete unreferenced objects during git-gc; they're not stored forever. But there are a lot of heuristics during gc, such as a grace period before pruning, to help keep data that could be valuable if the user messed up.)
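
A small sketch of the unreferenced-object case, assuming `git` is on the PATH. It stages two drafts in a row, so the first draft's blob is no longer referenced by the index or any commit. As far as I know, `git fsck` only reports such objects, and actual pruning happens in `git gc` after the `gc.pruneExpire` grace period. The `git` helper, `restage_demo`, the file name, and the identity values are all placeholders.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside repo and return its stdout."""
    cmd = ["git", "-c", "user.name=Example", "-c", "user.email=example@example.com", *args]
    return subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True).stdout


def restage_demo() -> tuple:
    """Stage two drafts; return (old blob ID, its recovered content, fsck report)."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")
        letter = repo / "letter.txt"
        letter.write_text("base\n")
        git(repo, "add", "letter.txt")
        git(repo, "commit", "-q", "-m", "base")

        letter.write_text("draft 1\n")
        git(repo, "add", "letter.txt")
        # ":path" names the blob currently recorded in the index for that path.
        old_oid = git(repo, "rev-parse", ":letter.txt").strip()

        letter.write_text("draft 2\n")
        git(repo, "add", "letter.txt")  # the index now points at a different blob

        # The first draft was never committed, yet its blob is still readable.
        recovered = git(repo, "cat-file", "-p", old_oid)
        report = git(repo, "fsck")  # reports the old blob as dangling
        return old_oid, recovered, report


if __name__ == "__main__":
    oid, recovered, report = restage_demo()
    print(oid, repr(recovered))
    print(report)
```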

[+] skrebbel|6 years ago|reply

The staging area^W^Windex^Wcache is a terribly designed mess, that's why.

The best way to think about it is it's just a half-finished commit, i.e. one without a message and an author and a date. But otherwise git treats the index like any other commit. Adding stuff to the staging area is like amending that commit. Actually committing it is like amending that commit again, but without changing the files, only editing the metadata (message, author, etc). And then moving the current branch to it.

You could totally simulate the cache by doing exactly this, i.e. a series of `git commit -a --amend` commands (just make sure you don't push halfway). The idea behind the staging area is that you obviously need this all the time, because reasons, so let's force you to go through the hassle for every commit you might want to make.

Because it's just a commit that hates your guts, it has all the same side effects as making a real commit has.
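
To make the "half-finished commit" analogy concrete, here is a sketch using Git's plumbing commands, assuming `git` is on the PATH. Strictly speaking the index is a file (`.git/index`) listing paths and blob IDs rather than a commit object, but `git write-tree` turns it into a tree, `git commit-tree` wraps that tree with a message and author, and `git update-ref` moves the branch. The helper and function names and the identity values are invented for the example.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside repo and return its stdout."""
    cmd = ["git", "-c", "user.name=Example", "-c", "user.email=example@example.com", *args]
    return subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True).stdout


def index_to_commit_demo() -> tuple:
    """Build a commit from the index by hand; return (tree, commit's tree, commit, HEAD)."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")
        (repo / "letter.txt").write_text("hello\n")
        git(repo, "add", "letter.txt")

        # Snapshot the current index as a tree object.
        tree = git(repo, "write-tree").strip()
        # Add the metadata (message, author, date) on top of that tree.
        commit = git(repo, "commit-tree", tree, "-m", "metadata added last").strip()
        # Move the current branch to the new commit.
        git(repo, "update-ref", "HEAD", commit)

        commit_tree = git(repo, "rev-parse", commit + "^{tree}").strip()
        head = git(repo, "rev-parse", "HEAD").strip()
        return tree, commit_tree, commit, head


if __name__ == "__main__":
    tree, commit_tree, commit, head = index_to_commit_demo()
    print(tree == commit_tree, commit == head)
```

The tree that comes out of the index is exactly the tree the commit points at, which is about as close as the analogy gets.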

[+] james-skemp|6 years ago|reply

While confusing for some, I love this part of `git add`.

While branching is already easy enough, I'll regularly get to a point where I may want to spend a few minutes going down a path. I'll either be happy where it leads me, or realize it was a bad idea and scrap it.

I'll `git add` the current state, make the changes I want, test it out, and then either revert back to what's staged, or like where I'm at and `git add` the rest in.

That and `git add -p` also mean that I rarely do a `git commit -a` or the like; stage it, then commit it.
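
Roughly, the checkpoint workflow looks like this as a script, assuming `git` is on the PATH. The exact command for going back to what's staged is my guess at one reasonable choice (`git checkout -- <path>`; newer versions also have `git restore <path>`), and the helper name, function name, and file contents are placeholders.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside repo and return its stdout."""
    cmd = ["git", "-c", "user.name=Example", "-c", "user.email=example@example.com", *args]
    return subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True).stdout


def checkpoint_demo() -> str:
    """Stage a known-good state, try an experiment, then fall back to the staged state."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")
        notes = repo / "notes.txt"
        notes.write_text("base\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-q", "-m", "base")

        notes.write_text("known-good change\n")
        git(repo, "add", "notes.txt")  # checkpoint: the index holds this version

        notes.write_text("risky experiment\n")  # go down the path for a few minutes

        # Bad idea after all: restore the working copy from the index, not from HEAD.
        git(repo, "checkout", "--", "notes.txt")
        return notes.read_text()


if __name__ == "__main__":
    print(repr(checkpoint_demo()))
```

If the experiment had worked out, a second `git add notes.txt` would replace the checkpoint instead.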

[+] frenchyatwork|6 years ago|reply

It's certainly an understandable point of confusion. It's not clear to me if the current behavior was actually intentional, or just a byproduct of implementation.

If you add a file, then modify it, and then commit it, your old version gets committed. That caused me a bit of confusion back in the day.
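
A quick sketch that reproduces the add, modify, commit sequence, assuming `git` is on the PATH. The helper name, function name, file name, and identity values are made up for the example.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside repo and return its stdout."""
    cmd = ["git", "-c", "user.name=Example", "-c", "user.email=example@example.com", *args]
    return subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True).stdout


def stale_commit_demo() -> tuple:
    """Add, modify, commit; return (committed content, working copy content, status)."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init", "-q")
        letter = repo / "letter.txt"

        letter.write_text("old\n")
        git(repo, "add", "letter.txt")   # the index records "old"
        letter.write_text("new\n")       # modified after staging
        git(repo, "commit", "-q", "-m", "commit what is staged")

        committed = git(repo, "show", "HEAD:letter.txt")
        status = git(repo, "status", "--porcelain")
        return committed, letter.read_text(), status


if __name__ == "__main__":
    committed, working, status = stale_commit_demo()
    print(repr(committed), repr(working), repr(status))
```

The commit contains the staged version, and the later edit shows up in `git status` as an unstaged modification.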

[+] dang|6 years ago|reply

A thread from 2016: https://news.ycombinator.com/item?id=12802949

A bit from 2015: https://news.ycombinator.com/item?id=9793069

Discussed at the time: https://news.ycombinator.com/item?id=9272249

[+] sdan|6 years ago|reply

How did they generate those nice-looking graphs?

[+] maryrosecook|6 years ago|reply

Hiya! Author of the article here. I used OmniGraffle.

[+] shadykiller|6 years ago|reply

I gave a similar talk “Inside Git Guts with Ruby” at RubyConf India 2013 - https://m.youtube.com/watch?v=lPlwkxrG2NM

I had to learn a lot of git internals and it was super fun.

34 comments