Running a git server is a bit more challenging than running a traditional stateless web app because git is entirely filesystem-centric. If you use NFS, pages that require many git operations can be very slow. And if the storage servers only expose a low-level API, some page loads might require many round trips, so added latency or long queues can hurt badly.

GitLab went through a similar journey from NFS to a high-level git API called Gitaly:

https://about.gitlab.com/blog/2018/09/12/the-road-to-gitaly-...

https://gitlab.com/gitlab-org/gitaly

There are some other projects like this one that seek to address the problem:

https://github.com/takezoe/gitmesh

Git is already good at synchronizing between peers, but it's not low latency, so it does require an extra management layer to make sure everything is correct.

handrous|4 years ago

I was on a project for which I built a kind of git hosting (as a side-effect of, or to support, other features of the product—we weren't a git-host-as-a-service, exactly) and ran into two things:

1) Locking. It's a pain in the ass. You're probably going to need to take it over or otherwise work around it somehow. I never did[0], but likely should have. At scale, with unreliable HTTP operations and all kinds of crazy stuff triggering writes to repos, you're going to end up with locking problems at some point.

2) Caching. Cache the hell out of metadata. Cache entire repo-wide metadata read operation output. Cache commits. Cache archives. Cache, cache, cache. Cache early, cache often.
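To make the caching point concrete, here's a rough sketch of one way to do it, not what I actually ran. It leans on the fact that a commit SHA names immutable content, so anything computed from a specific commit can be cached without ever invalidating it; only the ref-to-SHA lookup has to stay fresh. The class name, the key layout, and the `compute` callback are all made up for illustration.

```python
import threading


class CommitKeyedCache:
    """In-memory cache keyed by (repo_id, commit_sha, operation).

    Hypothetical helper: results derived from a fixed commit never change,
    so entries need no invalidation (eviction is left out of this sketch).
    """

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get_or_compute(self, repo_id, commit_sha, operation, compute):
        key = (repo_id, commit_sha, operation)
        with self._lock:
            if key in self._data:
                return self._data[key]
        # Run the expensive git read outside the lock so other keys
        # are not blocked while it runs.
        value = compute()
        with self._lock:
            # If another thread filled the key first, keep its value.
            return self._data.setdefault(key, value)
```

You'd call it like `cache.get_or_compute(repo_id, sha, "file-list", lambda: list_files(repo_id, sha))`, where `list_files` stands in for whatever slow repo-wide read you have. A real deployment would probably want a shared store and size limits instead of a plain dict.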

[0] in my defense, I built the whole thing solo and there are only so many hours in a day, and that was not the only thing I was working on.
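For the locking point, this is roughly the kind of thing I mean by taking it over: a minimal sketch of a per-repo write lock with a timeout, assuming a single POSIX host with the repos on a local filesystem. It uses `fcntl.flock`, which is advisory, so it only helps if every writer in your own service goes through it. The lock file name `host-write.lock`, the function name, and the exception are all invented here and have nothing to do with git's own `*.lock` files.

```python
import contextlib
import fcntl
import os
import time


class RepoLockTimeout(Exception):
    """Raised when the repository lock cannot be acquired in time."""


@contextlib.contextmanager
def repo_write_lock(repo_path, timeout=5.0, poll_interval=0.05):
    # Hypothetical lock file, separate from the lock files git creates itself.
    lock_path = os.path.join(repo_path, "host-write.lock")
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    deadline = time.monotonic() + timeout
    try:
        while True:
            try:
                # Non-blocking exclusive lock; fails fast if someone holds it.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise RepoLockTimeout(lock_path)
                time.sleep(poll_interval)
        yield
    finally:
        # Closing the descriptor releases the lock, including when the
        # process dies, so a crashed writer does not leave a stale lock.
        os.close(fd)
```

Anything that writes to a repo would then run inside `with repo_write_lock(path):`. The timeout matters more than it looks: with flaky HTTP clients you want a stuck writer to fail loudly rather than queue up requests forever.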

[EDIT] this was, like, 2011 or 2012 or something, so there was a lot less info floating around about how to do this, too.