justjake|18 days ago
We'll have a post mortem for this one, as we always write post mortems for anything that affects users.
Our initial investigation reveals this affects <3% of instances.
Apologies from myself + the Team. Any amount of downtime is completely unacceptable.
You may monitor this incident here: https://status.railway.com/cmli5y9xt056zsdts5ngslbmp
vintagedave|18 days ago
This affected a seemingly random set of services across three of my accounts (pro and hobby, depending on whether it's for work or just myself). That ranges from WordPress to static site hosting to a custom Python server. All of the deployments showed as Online, even after receiving a SIGTERM.
While 3% is 'good', that's an awfully wide range of things across multiple accounts for me, so it doesn't feel like 3% ;) Please publish the post mortem. I am a big fan of Railway but have really struggled with the number of issues recently. You don't want to get GitHub's growing rep. Some people are already requesting I move one key service away, since this is not the first issue.
Finally, can I make a request re communication:
> If you are experiencing issues with your deployment, please attempt a re-deploy.
Why can't Railway restart or redeploy any affected service? This _sounds_ like you're requiring 3% of your users to manually fix the issue. I don't know if that's a communication problem or the actual solution, but I certainly had to do it manually, server by server.
justjake|18 days ago
We rolled out a change to update our fraud model, which uses workload fingerprinting.
Since, in all likelihood, your projects are similarly structured, more of your workloads will be impacted if their shape was in the "false positive" set.
Will have more information soon but very valid (and astute) feelings!
iJohnDoe|18 days ago
Since there haven't been any responses on the official support forum, maybe this will help someone.
I did a backup of our deployment first and did a Restart (not a Redeploy). Our service came back up thankfully.
Obviously do your own safety check about persistent volumes and databases first.
port3000|18 days ago
Looking to move. It's a bit of a hassle to set up Coolify and Hetzner, but I have lost all trust.
suladead|10 days ago