Artisanal handcrafted Git repositories

266 points| drewsberry | 7 months ago |drew.silcock.dev

68 comments

bradfitz|7 months ago

My recent horror from some git work was discovering how git sorts its tree objects.

The docs just say to sort by C locale (byte-order sorting). Easy. Except git was sometimes rejecting my packfiles as being bogus per its fsck code, saying my trees were misordered.

TURNS OUT THERE'S AN UNDOCUMENTED RULE: you need to append an implicit forward slash to directory tree entry names before you sort them.

That forward slash is not encoded in the tree object, nor is the type of the entry. You just put the 20 byte SHA1 hash, which points to either a blob or a tree (or a commit for submodules).

So you can have one directory with directory "testing" and file "testing.md" and it'll sort differently than a directory with two files "testing" and "testing.md".

You can see a repro at https://gist.github.com/bradfitz/4751c58b07b57ff303cbfec3e39...

(So to verify whether a tree object is formatted correctly, you need to have the blobs of all the entries in the tree, at least one level)
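
For anyone who wants to see the rule in isolation, here is a minimal Python sketch of the ordering described above. It assumes entries arrive as `(name, is_dir)` pairs; the names `tree_sort_key` and `sort_tree_entries` are made up for illustration and are not part of git.

```python
def tree_sort_key(name: bytes, is_dir: bool) -> bytes:
    """Sort key for one tree entry: directories compare as if they ended in '/'."""
    # The slash is only used for comparison; it is never written to the tree object.
    return name + b"/" if is_dir else name


def sort_tree_entries(entries):
    """Order (name, is_dir) pairs by plain byte order of their sort keys."""
    return sorted(entries, key=lambda entry: tree_sort_key(*entry))


# Directory "testing" next to file "testing.md":
# b"testing.md" < b"testing/" because '.' (0x2e) sorts before '/' (0x2f).
with_dir = sort_tree_entries([(b"testing", True), (b"testing.md", False)])

# Two plain files: b"testing" is a prefix of b"testing.md", so it sorts first.
two_files = sort_tree_entries([(b"testing", False), (b"testing.md", False)])

print([name for name, _ in with_dir])
print([name for name, _ in two_files])
```

The two lists come out in opposite orders, which is the "testing" versus "testing.md" surprise from the comment.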

xqb64|7 months ago

I've had this exact bug happen to me when I implemented my git clone.

The way I found out was that Github kept rejecting my push, because as I later discovered, my git history was invalid precisely due to entries being sorted improperly due to the forward slash requirement. I could have solved this with the real git, but the point was to use my tool exclusively for version control from inception, so I just deleted the .git folder. So, my git history appears to begin near the end of the whole cycle. But I did manage to learn a lot, both about git and about the language I implemented it in.

Elucalidavah|7 months ago

> directory tree entry names

But... git doesn't really store directories, does it?

lucasoshiro|7 months ago

Something that I really like in Git is how its data structures are easy to understand and how transparent it is. It's possible to write your own "Git" compatible with existing Git directories only by reading how it works under the hood.
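
As a small illustration of that transparency, a blob's object ID in a SHA-1 repository is the hash of a short header plus the file contents. The sketch below uses only Python's standard library; the function name `git_blob_id` is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # A blob is hashed as: "blob <size in bytes>", a NUL byte, then the raw content.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b"hello\n"))
```

The result should match what `git hash-object` reports for a file containing the same bytes.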

shivasaxena|7 months ago

I agree, but only in theory.

Projects like gitoxide have been in development for years now.

veganjay|7 months ago

Neat to see this done by hand! It helps demystify the magic behind git commands.

If you like this, I also recommend "Write Yourself a Git", where you build a minimal git implementation using python: https://wyag.thb.lt/

xqb64|7 months ago

There is also James Coglan's "Building git" book that I just went through and can vouch for its quality.

sc68cal|7 months ago

To the site author: I'm on a MBP M1 Mac and honestly I can't really read the text. Far too small, and increasing the zoom just makes the text large but the margins less wide. Firefox reader mode also renders really badly.

Please, consider making the layout better for us old coders whose eyes are going, or for hi res displays

derefr|7 months ago

FYI: the pinch-to-zoom gesture from mobile browsers (from before websites were mobile-responsive) has also long been implemented for all modern desktop browsers. It's viewport zoom, which is far better than the font-scaling zoom you get by pressing Cmd-+, and makes this site easily readable.

(The much-less-well-known mobile double-tap-on-text gesture [it zooms-to-fit whatever element you tapped on to the width of the viewport] was also ported to desktop browsers. Though, on desktop with a touchpad, it's a two-finger double-tap — which I don't think anyone would ever even think to try.)

drewsberry|7 months ago

Thanks for letting me know. I'm on an M3 and haven't experienced any issues myself, but allowing for font size configuration seems like a pretty good idea. I've added this in, so if you happen to take another look, I'd be interested to hear whether you think this is an improvement.

retsibsi|7 months ago

For me, the text size would be fine if the contrast were better. The background colour is similar to the colour of the non-central pixels of the text, and even the central pixels are grey rather than black.

sam_lowry_|7 months ago

Works great on Firefox for Android though )

HexDecOctBin|7 months ago

Okay, there's something I have been thinking about recently. Is it possible to somehow make Git use the Content Defined Chunking algorithm from rsync? Maybe somehow using clean/smudge? If not git, then maybe Mercurial, Fossil or any other DVCS?

This would help with large binary assets without having to deal with the mess that is LFS, as long as the assets were uncompressed.
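
To make the idea concrete, here is a rough Python sketch of content-defined chunking. It is not rsync's or Git's actual algorithm: it assumes a simple polynomial rolling hash over a fixed window, and every constant and name in it (`WINDOW`, `MASK`, `chunk`, `sample_data`) is invented for illustration.

```python
import hashlib

WINDOW = 16          # bytes covered by the rolling hash (assumed value)
MASK = (1 << 6) - 1  # cut when the low 6 bits are zero, roughly every 64 bytes
BASE = 257
MOD = (1 << 31) - 1


def chunk(data: bytes, min_size: int = 16, max_size: int = 1024) -> list:
    """Split data at positions chosen by a hash of the last WINDOW bytes."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the oldest byte so the hash only covers the current window.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        size = i + 1 - start
        if (size >= min_size and (h & MASK) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def sample_data() -> bytes:
    """Deterministic pseudo-random bytes, standing in for a binary asset."""
    return b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(600))


original = chunk(sample_data())
edited = chunk(b"!" + sample_data())  # one byte inserted at the front
shared = set(original) & set(edited)
print(len(original), len(edited), len(shared))
```

Because cut points depend only on nearby bytes, an insertion near the start should shift the later chunks without changing their contents, so most of them can be stored once and reused.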

hanwenn|7 months ago

IIRC it already uses content defined chunking for finding object deltas.

aeblyve|7 months ago

I thought this was going to be a sardonic article about doing programming without LLMs.

lioeters|7 months ago

I'm starting to see this kind of wording as a unique selling point, that some software (or article, visual art, etc.) is handcrafted and artisanal, as opposed to AI-generated. "Every word was written by me, a human being!" At this point in the emerging technology I can usually tell the difference intuitively, but it's possible that one day it will be indistinguishable - and the quality of "handmade" will be simply a matter of branding for niche enthusiasts, like vinyl records.

lemming|7 months ago

Git refers to the user-friendly commands as “porcelain”

Ahhhhahahaha… “user friendly”. When compared to coding the repo by hand, I guess.

antonvs|7 months ago

This is what happens when you let an OS kernel guy write a cli.

kassah|7 months ago

The simplicity of Git is awesome. Great article! I had looked at what it would take to find a single file in a remote git repo. I decided against speaking the git protocol directly and just checked out the entire repo to get a single file. Reading through this makes me think I may have given up too easily.

I asked a few git hosting providers, and they all said they had private APIs developed internally for the purpose.

BobbyTables2|7 months ago

I realize the concept is very similar but would love to see a writeup on how Docker stores images using OverlayFS. (Has quite a bit of metadata!)

mitchitized|7 months ago

I closed the tab as soon as I saw `ignorecase = true`.

Absolutely NOT going there again.

* points at numerous scars and trauma

jllyhill|7 months ago

Am I the only one having troubles with the site on mobile? I'm using Firefox on a decent Android phone but the scroll is extremely stuttery and it distracts from the article unfortunately.

styanax|7 months ago

The site is built with a content creation tool which has used a lot of JS and CSS, but the CSS is atrocious in its automated output so it's triggering the browser to have to interpret the mess of directives in every code block. The tool is generating HTML trash like (brackets replaced for comment to not parse):

    [span style="--0:#E1E4E8;--1:#24292E"] [/span]
...over and over, essentially giving style directives for every blank space in the code block. A less capable mobile CPU may well have issues rendering this site due to the presence of so much trash CSS inside its guts. $0.02 hth

drewsberry|7 months ago

There was a silly on scroll listener that was doing basically nothing – I've tried removing that so if you happen to visit again, I'd be very grateful if you could let me know whether it is still happening (I can't reproduce it myself).

DrBazza|7 months ago

I'm glad I clicked through to the actual article rather than dismissing it via its slightly silly title. I learnt a few things about git, and I didn't realize that the tool `pigz` existed. Today I learnt...

deadbabe|7 months ago

[deleted]

ChrisMarshallNY|7 months ago

My understanding is that Mercurial is sort of Beta to Git's VHS. There are some definite advantages, but it's losing support.

zanecodes|7 months ago

I thought all the cool kids were on Pijul, or was it Darcs? Maybe it was Fossil? No wait, it was definitely Jujutsu.

gerdesj|7 months ago

This is all very well but how does Linus Torvalds use git? Given he invented the bloody thing, it might be nice to see how the Boss uses it!

git was created to scratch an itch (actually a bit of a roiling boil, that needed a serious amount of soothing ointment and as it turns out: a compiler, some source code and quite a lot of effort). ... anyway the history of it is well documented.

FFS: git was called git because a Finnish bloke with English as a second, but well used, tongue had learned what a "git" is and it seemed appropriate. Bear in mind that Mr T was deeply in his shouty phase at that point in time.

Artisanal git sounds all kinds of wrong 8) It's just a tool to do a job and I suggest you use it in the same way as the XKCD comic mandates (that is the official manual, despite what you might think)

The Conclusion is spot on - great article.

lysace|7 months ago

I would have called this: "Futzing around with internal git data structures".