They blamed the March and April outages on a database query that changed as part of an infrastructure change they rolled out. I'm guessing that infrastructure change introduced some other race condition that they are only seeing after a major production failure, because they didn't load test enough in their staging environment: https://github.blog/2023-05-03-github-availability-report-ap...
Today seems worse than yesterday. I'm getting wildly inconsistent results when viewing repositories after a push. Hard to tell if my push actually went through, and it's not triggering actions.
From an SRE, one of their DB clusters failed. They use Vitess which is great, but it can be prone to hotspots and doesn't auto-shard. Heavy usage (esp. from large customers, rogue jobs) can take down the cluster. When it goes down, it's a PITA to resolve.
I'm considering hosting a Gitea instance to back up all of my repos.
I have an important fix that needs to be deployed right now, but there is no way to deploy it the normal way because our CI was set up with GitHub Actions. Fortunately I have instructions for bypassing CI and building from source myself.
But again, GitHub defeats me, because our release workflows depend on GitOps, which is also affected by the GitHub issue. Ahhhhhhhhhh. I have to build the Docker image, push it to ECR, then update a YAML template to make EKS apply the new changes.
It's 9 PM in my timezone and I'm waiting for my patches to go up. A frustrating incident.
Gitea's ability to create a local repository as a mirror of a remote repository is great for this. You can stay on GitHub and have your code regularly mirrored locally.
> Codespaces is experiencing degraded performance. We are continuing to investigate.
Imagine not only not being able to push your code, but also not even being able to _write your code_ at all. And so many orgs rely on Actions to even be able to deploy. Geez. I personally believe that the cloud sucks.
Three days in a row of outages, all within less than a week, following yesterday's GitHub Actions downtime [0].
Really, at this point you might as well consider self-hosting. GitHub falling apart is looking chronic, and self-hosting was indeed the sensible idea, just as other open source projects have been doing for years.
GitHub is going just great, and centralizing everything to GitHub really was a good idea, wasn't it? [1] /s
I've had some actions queued for multiple days now on certain repos, but not others. I've cancelled them and restarted them during the green status intervals but they all go back to "Queued". I've also cancelled them and then made slight documentation tweaks to get new commit hashes on the branches and it still goes to queued.
How??? You do the merge, which either creates a new commit for the change, or appends the commits to your existing tree. Then you push that to the remote. If the push fails, you can just push again, it's not lost. And if the merge failed, you didn't have any merge commit to begin with.
> going to self-host my git repos. Any recommendations?
Depending on your needs, this can be as simple as sticking repos on any server you have and cloning/pulling/pushing over ssh. If you want something more sophisticated, though, there's a handful of nice applications (gitea is being suggested further up-thread).
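To make the "any server plus ssh" option concrete, here is a minimal sketch of the commands involved, wrapped in Python so they can be inspected before running. The host `git@example.com`, the path `repos/project.git`, and the remote name `backup` are placeholders I made up, not anything from this thread.

```python
import shlex
import subprocess


def ssh_remote_commands(host, repo_path, remote="backup"):
    """Build the commands that set up a bare repo on `host` and push to it."""
    url = f"{host}:{repo_path}"  # scp-style ssh URL understood by git
    return [
        # Create an empty bare repository on the server.
        ["ssh", host, f"git init --bare {shlex.quote(repo_path)}"],
        # Register the server as an extra remote of the local clone.
        ["git", "remote", "add", remote, url],
        # Push every branch, then every tag.
        ["git", "push", "--all", remote],
        ["git", "push", "--tags", remote],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical host and path; printed rather than run so this is safe to try.
    for cmd in ssh_remote_commands("git@example.com", "repos/project.git"):
        print(shlex.join(cmd))
```

Calling `run_all(...)` from inside an existing clone would actually execute the commands. This only covers plain push/pull; there is no web UI, permissions model, or CI.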
I'd say it's reasonable to list it as separate outages on the status page as it's really a representation of "is github available and working as expected". Even if it is the same issue, when they manage to mitigate it (or it goes away) I'd want to see that everything is now available from a user perspective.
That said, they're getting to the point where they really need to make some larger post about this. It seems reasonable to assume it is all from one root cause.
Loading github.com is returning a 500 for me currently, so seems like more than just issues/pull requests. Also seeing actions fail with 500s on assorted steps.
Similar issues for me. I can load github.com and my profile, but visiting a repository (or trying to git pull a repo with the https origin) returns a 500.
I can confirm this as well. I started seeing 500 errors intermittently when trying to view pages, so I checked the status page and saw everything was green. The status page started showing the incident within about 3 minutes of when I started seeing issues. Clearly that's all based on the happenstance of when I was landing on GitHub's website, but I have found that of all the status pages run by large companies, GitHub's is almost always showing an incident as soon as I start noticing issues myself.
Yeah, I was getting 500s for about three minutes before they posted the status update. I guess it's good that they at least update the status page in a timely fashion, but the third day in a row of downtime is not exactly good service.
Yes, we do this using https://gitea.io/en-us/ on a private server (firewall, backups and a replica) for most projects. Github is only used when it's required by a stakeholder.
Are you interested in spinning up an entire CI environment? Or something where anyone can push a branch to a file-based mirror?
Some aspects of permissioning that the cloud git repo providers offer are harder to implement in a home-grown solution, and unless you have the resources to maintain it, that gets interesting too.
On one hand, you can do `git clone --mirror` and you'll have a copy of the repo and put that on a file share... though there's no permissions or automatic syncs for it (or CI). If you want those, then you get into some development (and maintenance) of the git hooks.
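As a rough sketch of the manual-sync part: the first run does the `git clone --mirror`, and later runs refresh the existing mirror with `git remote update --prune`. The URL and mirror path below are hypothetical, and scheduling (cron or similar) is left out.

```python
import os
import subprocess


def mirror_command(remote_url, mirror_dir):
    """Pick the git command that creates the mirror or refreshes an existing one."""
    if os.path.isdir(mirror_dir):
        # Mirror already exists: fetch all refs and drop ones deleted upstream.
        return ["git", "-C", mirror_dir, "remote", "update", "--prune"]
    # First run: make a bare copy of every ref in the remote.
    return ["git", "clone", "--mirror", remote_url, mirror_dir]


def sync_mirror(remote_url, mirror_dir):
    """Run the chosen command, raising if git exits with an error."""
    subprocess.run(mirror_command(remote_url, mirror_dir), check=True)


if __name__ == "__main__":
    # Hypothetical URL and path; printed rather than run.
    print(mirror_command("https://example.com/org/project.git",
                         "/srv/mirrors/project.git"))
```

Running `sync_mirror` on a schedule gives you the "automatic sync" piece, but still no permissions or CI, which is where the hook development comes in.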
Going to things like a locally hosted GitLab instance means that you need to have a local Docker-hosted environment running, and someone to maintain that, and the storage for it, and all of the other fun that comes with administering a complex third-party application on prem. When things are going well, it's an hour or two a week... when something breaks, it's several hours with calls to support (you're using a paid / licensed version to get support... right?) from someone who has a sysadmin skill set rather than a developer skill set. And don't forget about DR.
In the past I worked at a company which used the commercial solution from JFrog, I don't remember ever having problems with git availability as a user.
intunderflow|2 years ago
It's a feature, not a bug!
agos|2 years ago
willsmith72|2 years ago
dbingham|2 years ago
aranw|2 years ago
iepathos|2 years ago
frde|2 years ago
zamalek|2 years ago
candiddevmike|2 years ago
SideburnsOfDoom|2 years ago
I suspect that it's related to high load.
pera|2 years ago
tonyhb|2 years ago
buglungtung|2 years ago
galleywest200|2 years ago
michaelmure|2 years ago
isaacdl|2 years ago
joostlek|2 years ago
Vasniktel|2 years ago
gtrax|2 years ago
https://www.cnet.com/culture/windows-may-crash-after-49-7-da...
VWWHFSfQ|2 years ago
gabrielgio|2 years ago
It's still weird to me how many companies rely on GitHub infra, and how heavily.
zomglings|2 years ago
This is the error I was seeing:
candiddevmike|2 years ago
timvdalen|2 years ago
cloudking|2 years ago
dbingham|2 years ago
Of course, if this continues happening it may be time for the OSS community to seriously consider migrating en masse.
[Edit to fix grammar - thanks for the corrections!]
jgadelange|2 years ago
goodoldneon|2 years ago
PestoDiRucola|2 years ago
c16|2 years ago
ta1243|2 years ago
rvz|2 years ago
[0] https://news.ycombinator.com/item?id=35887029
[1] https://news.ycombinator.com/item?id=22867803
ta1243|2 years ago
(/s of course)
melx|2 years ago
Right now I can ignore that my PRs show a 500 error or, in the best case, old code.
But... I cannot build and ship because one of the project's dependencies relies on something hosted on GH.
dijit|2 years ago
samwillis|2 years ago
Also hugs to any Devs, Ops or SREs outside GitHub directly affected by this.
Looking forward to a post-mortem on the last few days, I'm sure it will be a really interesting read.
jamespetercook|2 years ago
booleanbetrayal|2 years ago
martiuk|2 years ago
dugmartin|2 years ago
atl4s|2 years ago
capableweb|2 years ago
melx|2 years ago
git+nginx would suffice, but it does not offer a GUI. I need one to see proposed changes (aka PRs).
Gitea is nice, but a bit overkill for my needs. I don't need CI, file hosting, issues, team members, releases, wiki, forking/watching/starring, etc.
justinclift|2 years ago
yjftsjthsd-h|2 years ago
dboreham|2 years ago
greenie_beans|2 years ago
bluehatbrit|2 years ago
MattIPv4|2 years ago
zachallaun|2 years ago
mostafah|2 years ago
But strange that it keeps happening almost every day now.
darrenkopp|2 years ago
longwave|2 years ago
acyou|2 years ago
tommy_axle|2 years ago
shagie|2 years ago
q3k|2 years ago
cube2222|2 years ago
aeyes|2 years ago
c12|2 years ago
trollied|2 years ago
est|2 years ago
nickthesick|2 years ago
Vermyndax|2 years ago
WesolyKubeczek|2 years ago
It would show a very prominent zero and be a static page with no logic whatsoever.
bhouston|2 years ago
daniaal|2 years ago
megadopechos|2 years ago
ryandvm|2 years ago
yeck|2 years ago
SettembreNero|2 years ago
mminer237|2 years ago
Edit: 10 minutes later, GitHub finally shows the push, but triggers still aren't working.
Edit #2: Things are working normally now.
rhymeswithjazz|2 years ago
moltar|2 years ago
talboren|2 years ago
dboreham|2 years ago
remorses|2 years ago
gabrielizaias|2 years ago
talboren|2 years ago
hideoncode|2 years ago
[deleted]
unknown|2 years ago
[deleted]
mr90210|2 years ago