It's weird that one of the reasons you endorse AWS is that you had regular meetings with your account manager, but you then regret premium support, which is the whole reason you had those regular meetings in the first place.
As a counterpoint, I find our AWS support team to be a mix of 40% helpful, 40% “things we say are going over their heads,” and 20% attempting to upsell and expand our dependence. It’s nice that we have humans, but I don’t think it’s a reason to choose it or not.
GCP’s architecture seems clearly better to me especially if you are looking to be global.
Every organization I’ve ever witnessed eventually ends up with some kind of struggle with AWS’ insane organizations and accounts nightmare.
GCP’s use of folders makes way more sense.
GCP having global VPCs is also potentially a huge benefit if you want your users to hit servers that are physically close to them. On AWS you have to architect your own solution with global accelerator which becomes even more insane if you need to cross accounts, which you’ll probably have to do eventually because of the aforementioned insanity of AWS account/organization best practices.
Googlers, in their infinite wisdom, have built a startup ecosystem for GCP that assigns "startups" to entry-level new hires who are scrambling to figure out how to be account managers while learning how to talk to humans, because Googlers are generally not used to interacting with humans, just code; that is the result of the programmatic hiring/screening process. Each newly hired 20-something is also assigned 3,000 GCP accounts to manage.
And what is the engineering used to tell a weak startup from a growing company, you ask? Well... Googlers again use arbitrary numbers rather than logic (a human-interaction-avoidance firewall) and set the floor at $30M of "publicly declared investment capital." So what happens when you're the GCP architect consultant hired to help a successful startup productionize their GCP infra, but their last round was private? Google tells the soon-to-be-$100M company they aren't real yet... so they go get their virtual CPU, RAM, and disk from AWS, which knows how to treat customers right, with account managers who pick up the phone and invite you to lunch to talk about your successful startup growing on AWS. Googlers are the biggest risk factor to the far superior GCP infrastructure, for any business, startup or Fortune 10.
If you spend enough (or they think you'll spend enough), you'll get an account manager without the premium support contract, especially early in the onboarding.
If you know what you're doing you don't need AWS support.
We add support when we want to do something new, like MediaTailor + SSAI. At that point we're exploring and trying to get our heads around how things work. Once it works there's no real point in support.
That said, you need to ask your account manager about (1) discounts in exchange for spend commitments, and (2) technical assistance. In general we have a talk with our AM when we're doing something new, and they rope in SMEs from the various products for us.
We're not that big, and I haven't worked for large companies, and it's always been a mystery to me why people have problems dealing with AWS. I've always found them to be super responsive and easy to get ahold of. OTOH we actually know what we're doing technically.
Google Cloud, OTOH, is super fucked up. I mean seriously, I doubt anyone there has any idea WTF is happening or how anything works anymore. There's no real cohesion, or at least there wasn't the last time I was abused by GCP.
I never got this in the comparison of aws between gcp. Why do people need direct support that much?
In 8 years, I had to reach out to GCP maybe twice and still got an answer anyway.
I, too, prefer McDonald's cheeseburgers to ground glass mixed with rusty nails. It's not so much that I love Terraform (spelled OpenTofu) as that it's far and away the least bad tool I've used in the space.
Terraform/OpenTofu is more than OK. The fact that you can use it to configure your Cisco products as well as AWS is honestly great for us. It's also a bit like Ansible: if you don't manage it carefully and try to separate as much as possible early, it starts bloating, so you have to curate early.
Terragrunt is the only sane way to deploy Terraform/OpenTofu in a professional environment, though.
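For anyone who hasn't seen Terragrunt: the core idea is one small child config per environment/component that pulls shared settings from a parent file. A minimal sketch (module path and inputs are illustrative, not from the thread):

```hcl
# terragrunt.hcl in e.g. live/staging/vpc/ (values are made up)
include "root" {
  # Inherit remote-state and provider config from the parent terragrunt.hcl.
  path = find_in_parent_folders()
}

terraform {
  # Double slash marks the module root within the repo.
  source = "../../../modules//vpc"
}

inputs = {
  cidr_block = "10.0.0.0/16"
  env        = "staging"
}
```

The payoff is that backend/state wiring is written once in the parent instead of copy-pasted into every Terraform root module.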
You can honestly do a lot of what people do with Terraform now just using Docker and Ansible. I'm surprised more people don't try to. Most clouds are supported, even private clouds and stuff like MAAS.
This is the best post to HN in quite some time. Kudos to the detailed and structured break-down.
If the author had a Ko-Fi they would've just earned $50 USD from me.
I've been thinking of making the leap away from JIRA, and I concur on RDS, Terraform for IaC, and FaaS whenever possible. Google support is non-existent and I only recommend GCP for pure compute. I hear good things about Bigtable, but I've never used it in production.
I disagree on Slack usage aside from the postmortem automation. Slack is just gonna' be messy no matter what policies are put in place.
Anecdotally, I've actually had pretty good interactions with GCP, including fast turnarounds on bugs that couldn't possibly affect many other customers.
What do you use if not Slack? OP's advice is standard best practice: respect people's time by not expecting an immediate response, and use team- or function-based channels as much as possible.
Other options are email, of course, and what, Teams for instant messages?
Goes on to use Kubernetes and an entire GitOps stack to run a process. I truly do wonder what difficulty there is in transferring a binary to the system, writing a systemd unit file, and being done with it.
Much of this matches my own experience. A few thoughts:
1. Cost tracking meetings with your finance team are useful, but for AWS and other services that support it I highly recommend setting billing alarms. The sooner you can know about runaway costs, the sooner you can do something about it.
2. Highly recommend PGAnalyze (https://pganalyze.com/) if you're running Postgres in your stack. It's really intuitive, and has proven itself invaluable many times when debugging issues.
3. Having used Notion for like 7 years now, I don't think I love it as much as I used to. I feel like the "complexity" of documents gets inflated by Notion and the number of tools it gives you, and the experience of just writing text in Notion isn't super smooth IMO.
4. +1 to moving off JIRA. We moved to Shortcut years ago, I know Linear is the new hotness now.
5. I would put Datadog as an "endorse". It's certainly expensive but I feel we get loads of value out of it since we leaned so heavily into it as a central platform.
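To the billing-alarm suggestion in point 1: on AWS the mechanism is a CloudWatch alarm on the `AWS/Billing` `EstimatedCharges` metric (which requires billing alerts to be enabled, and only exists in us-east-1). A sketch of the parameters involved, assuming boto3; the threshold, SNS topic, and names are illustrative:

```python
def billing_alarm_params(threshold_usd: float, sns_topic_arn: str) -> dict:
    """Build PutMetricAlarm kwargs for a monthly-spend alarm (illustrative)."""
    return {
        "AlarmName": f"billing-over-{int(threshold_usd)}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # 6h; billing metrics only update a few times a day
        "EvaluationPeriods": 1,
        "Threshold": threshold_usd,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }

# With boto3 this would be wired up roughly like:
# boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(
#     **billing_alarm_params(500.0, "arn:aws:sns:us-east-1:123456789012:billing"))
```

A few alarms at staggered thresholds (50%, 80%, 100% of expected spend) give you a trend, not just a fire alarm.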
There's a big difference between runaway costs and "these costs over here are 10-20% higher than we think makes sense, especially compared to what we're spending over there; let's spend some time figuring out how to reduce them."
You should be doing both - belt and suspenders.
Self-managing a database vs. getting RDS isn't an easy choice. It depends on the scale, it depends on the industry... if you're already locked into AWS, the price difference between bare machines and RDS usually isn't enough to pay for another person.
If you're starting everything from scratch, you might think that going to other providers (like Hetzner) is a good idea, and it may definitely be! But then you need to set up a Site2Site VPN because the second big customer of your B2B SaaS startup uses on-premises infrastructure and AWS has that out of the box, while you need an expert networking guy to do that the right way on Hetzner.
The last startup I was with that used AWS didn't spend anything on premium support. We were given startup credits to apply to our accounts, and they were always happy to hand out more to get us hooked.
Most startups don’t need a DBA, just competent full-stack/backend engineers. That being said, I understand why many startups prefer having a DBA. It's not exactly fun when your only staff engineer likes to just store everything in a jsonb column in Postgres.
I think we're making a mistake by shoving all of this into the cloud rather than building tooling around local agents (worktrees, containers, as mentioned as "difficult" in the post). I think as an industry we just reach for cloud like our predecessors reached for IBM, without critical thought about what's actually the right tool for the job.
If you can manage docker containers in a cloud, you can manage them on your local. Plus you get direct access to your own containers, local filesystems and persistence, locally running processes, quick access for making environmental tweaks or manual changes in tandem with your agents, etc. Not to mention the cost savings.
The thing is that startups often don't have the time or capital to build a data center even though public cloud is just more expensive. If you're bootstrapping a business then it makes sense. My advice would be to always use only those features of the public cloud that you can also use on your private cloud, such as Kubernetes.
>Since the database is used by everyone, it becomes cared for by no one. Startups don’t have the luxury of a DBA, and everything owned by no one is owned by infrastructure eventually.
This post was a great read.
Tangent to this, I've always found "best practices" to be a bit of a misnomer. In most cases in software and especially devops I have found it means "pay for this product that constrains the way that you do things so you don't shoot yourself in the foot". It's not really a "practice" if you're using a product that gives you one way to do something. That said my company uses a very similar tech stack and I would choose the same one if I was starting a company tomorrow, despite the fact that, as others have mentioned, it's a ton to keep in your head all at once.
> In most cases in software and especially devops I have found it means "pay for this product that constrains the way that you do things so you don't shoot yourself in the foot". It's not really a "practice" if you're using a product that gives you one way to do something.
The good thing about a lot of devops saas is that you're not paying anyone on staff to understand the problem domain and guide your team. The bad thing is that you're not paying anyone on staff to understand the problem domain and guide your team.
The SQLite-per-customer pattern mentioned in the database subthread is underrated. I've been running a FastAPI app with a single SQLite database (WAL mode + FTS5) and the operational simplicity is genuinely life-changing compared to managing Postgres.
The key insight: for read-heavy workloads on a single machine, SQLite eliminates the network hop entirely. Response times drop to sub-15ms for full-text search queries. The tradeoff is write concurrency, but if your write volume is low (mine is ~20/day), it's a non-issue.
The one thing I'd add to the article: the biggest infrastructure regret I see is premature complexity. Running Postgres + Redis + a message queue when your app gets 100 requests/day is solving problems you don't have while creating problems you do (operational overhead, debugging distributed state, config drift between environments).
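The WAL + FTS5 setup mentioned above really is only a few lines. A minimal sketch, with made-up table and column names, assuming your Python's bundled SQLite was compiled with FTS5 (most builds are):

```python
import os
import sqlite3
import tempfile

# WAL mode needs a file-backed database (it's a no-op for :memory:).
path = os.path.join(tempfile.mkdtemp(), "app.db")
db = sqlite3.connect(path)

db.execute("PRAGMA journal_mode=WAL")    # readers don't block the single writer
db.execute("PRAGMA synchronous=NORMAL")  # the usual durability pairing with WAL

# The full-text index lives right next to the data: no separate search service.
db.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
db.execute("INSERT INTO docs VALUES (?, ?)",
           ("ops notes", "operational simplicity of sqlite with fts5"))
db.commit()

hits = db.execute("SELECT title FROM docs WHERE docs MATCH 'simplicity'").fetchall()
print(hits)  # [('ops notes',)]
```

That is the whole "search stack": no network hop, no cluster, one file to back up.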
I agree with the sentiment, that most non-decisions are really implicit decisions in disguise. They have implications whether you thought about them up front or not. And if you need to revisit those non-decisions, it will cost you.
But I don't like calling this tech debt. The tech debt concept is about taking on debt explicitly, as in choosing the sub-optimal path on purpose to meet a deadline then promising a "payment plan" to remove the debt in the future. Tech debt implies that you've actually done your homework but picked door number 2 instead. A very explicit choice, and one where decision makers must have skin in the game.
A hurried, implicit choice has none of those characteristics - it's ignorance leading (inevitably?) to novel problems. That doesn't fit the debt metaphor at all. We need to distinguish tech debt from plain old sloppy decision making. Maybe management can even start taking responsibility for decisions instead of shrugging and saying "Tech debt, what can you do, amirite?"
I have seen this post before. I don't know what exactly they do, but that's an extraordinary list of products to be managing. I hope they are making enough revenue to cover those outrageous costs.
Feels like a minor glimpse into what's involved in running tech companies these days. Sure this list could be much simpler, but then so would the scope of the company's offerings. So AI would offer enough accountability to replace all of this? Agents juggling million token contexts? It's kind of hard to wrap my head around.
Agents run tools, too. You can make an LLM count by the means of language processing, but it's much more efficient to let it run a Python script.
By the same token, it's more efficient to let an LLM operate all these tools (and more) than to force an LLM to keep all of that on its "mind", that is, context.
In my experience, it's easier to take schema out into a new DB in the off-chance it makes sense to do so.
The big place I'd disagree with this is when "your" data is actually customer data; then you want 1 DB per customer whenever you can, and SQLite is your BFF here. You have 1 DB for your own stuff (accounting, whatever) and then 1 SQLite file per customer that holds their data. Your customer wants a copy? You run .backup and send them the file, easy peasy. They get pissed, rage-quit, and demand you delete all their data? Easy!
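A sketch of that export path using Python's wrapper over SQLite's online backup API (the same mechanism as the `.backup` CLI command; function and file names here are hypothetical):

```python
import sqlite3

def export_customer(customer_id: str) -> str:
    """Produce a consistent snapshot of one tenant's database file."""
    src = sqlite3.connect(f"{customer_id}.db")
    out_path = f"{customer_id}-export.db"
    dst = sqlite3.connect(out_path)
    with dst:
        src.backup(dst)  # online backup: safe even while the app keeps writing
    src.close()
    dst.close()
    return out_path
```

And honoring the rage-quit deletion request is then literally removing one file.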
Highly recommend reading Designing Data-Intensive Apps [1] and Monolith to Microservices [2]. I can't remember which (maybe both?) but I definitely took away the idea that if services share a DB, that DB's schema is now a public interface and becomes much more difficult to evolve with new requirements.
Coming from a world of acquisitions, I see almost every startup make the same decision of having a single database for everything. Can’t stress enough how big of a problem this becomes once you scale even a little bit. Migrations are expensive and time consuming. And for most teams, moving an application to a different db almost always becomes an urgent need, when they are least able to.
He doesn't want to manage the database the way he manages the rest of his infrastructure. All of his bullet points apply to other components as well, but he's absorbed the cost of managing them and assigning responsibilities.
- Crud accumulates in the [infrastructure thingie], and it’s unclear if it can be deleted.
- When there are performance issues, infrastructure (without deep product knowledge) has to debug the [infrastructure thingie] and figure out who to redirect to
- [infrastructure thingie] users can push bad code that does bad things to the [infrastructure thingie]. These bad things may PagerDuty alert the infrastructure team (since they own the [infrastructure thingie]). It feels bad to wake up one team for another team’s issue. With application owned [infrastructure thingies], the application team is the first responder.
The issue wasn't sharing a database, it was not being clear about who owns what.
Having multiple teams with one code base that has one database is fine. But every line of code, table, and column needs to be owned by exactly ONE team.
Ownership is the most important part of making an organization effective.
What's the DBMS? We moved in the other direction with postgres, merged multiple databases to simply have a schema per service/application instead. All the advantages with none of the disadvantages, imo. (We then had a single database per running test/dev environment, rather than multiple.) Of course, that's a pg thing, if you use MySQL for example it's not an option.
Accept that eventually you'll have multiple databases, so it makes sense to plan from that from the start and get in place the mechanisms for the databases to talk to each other.
The part about account teams for AWS and GCP is very true in my experience. I could tell my AWS account team that I was hungry and they would offer to bring me a bagel in an hour. My GCP account team no-shows our cadence calls and somehow forgets the one question I ask them in the intervening time between our calls, which means each month I get to re-explain the issue as they pretend to escalate it again.
> Regret: Not adopting an identity platform early on. I stuck with Google Workspace at the start...
I've worked with hundreds of customers to integrate IdPs with our application, and Google Workspace was by far the worst of the big players (Entra ID, Okta, Ping). It's extremely inflexible for even the most basic SAML configuration. Stay far, far away.
And it's a horrible moat. I've gotten locked out of a Google Workspace permanently because the person who set it up left, used a personal email/phone to do it, and despite us owning/controlling the domain, Google wouldn't unlock admin access to the Workspace for us, they would only delete it. Unacceptable business risk.
I disagree on Kubernetes versus ECS. For me, the reasons to use ECS are not having to pay for a control plane, and not having to keep up with the Kubernetes upgrade treadmill.
As a non infra guy I'll say this. I'm curious about Linear. At my own company I vibecoded my own project management app against the JIRA API because I can't stand our version of JIRA. It's too many clicks, too many things to remember and it's unintuitive.
If you have the power to do so, get rid of JIRA immediately. There are like 10 competitors that are all dramatically better.
I would personally recommend https://www.shortcut.com which is very well designed, and also made some really sensible improvements over the time that we used it.
Baffling piece of software. It's a task manager and every time I use it I flail around for ages trying to figure out how to mark a task completed. No idea why people like it.
As everyone knows JIRA sucks but some perfect implementation of it exists in the ether at some company you will never work at :)
These days, AI in docs, specs, and the production lifecycle means we need AI-first ticket tooling. I haven't used Linear, but I suspect it works far better with AI than JIRA does.
Been incredibly happy with the speed, featureset, and pace of new (good) features in Linear. Our team has adopted it quite happily and it gets a ton of good use. Can fully recommend.
I initially read this wrong as "Almost every infrastructure decision I make I regret after 4 years", and I nodded my head in agreement.
I've been working mostly at startups for most of my career (for Sydney, Australia values of "startup", which mostly means "small and new or new-ish business using technology", not the Silicon Valley VC-money-powered moonshot crapshoot meaning). Two of those roles (including the one I'm in now) have been longer than a decade.
And it's pretty much true that almost all infrastructure (and architecture) decisions are things that 4-5 years later become regrets. Some standouts from 30 years:
I didn't choose Macromind/Macromedia Director in '94 but that was someone else's decision I regretted 5 years later.
I shouldn't have chosen to run a web business on ISP web hosting and Perl4 in '95 (yay /cgi-bin).
I shouldn't have chosen globally colocated desktop pc linux machines and MySQL in '98/99 (although I got a lot of work trips and airline miles out of that).
I shouldn't have chosen Python2 in 2007, or even worse, AngularJS in 2011.
I _probably_ shouldn't have chosen Arch Linux (and a custom/bastardised Pacman repo) for a hardware startup in 2013.
I didn't choose Groovy on Grails in 2014 but I regretted being recruited into being responsible for it by 2018 or so.
I shouldn't have chosen Java/MySQL in 2019 (or at least I should have kept a much tighter leash on the backend team and their enterprise architecture astronaut).
The other perspective on all those decisions, though, is that each of them allowed a business to do the things it needed to take money off customers (I know, I know, that's not the VC startup way...). Although I regretted each of those later, even in retrospect I think I made decent pragmatic choices at the time. And at this stage of my career I've become happy enough knowing that every decision is probably going to have regrets over a 4 or 5 year timeframe, but that most projects never last long enough for you to get there - either the business doesn't pan out and the project gets closed down, or a major ground-up rewrite happens for reasons often unrelated to 5-year-old infrastructure or architecture choices.
I'll swap in Cloudflare Zero Trust for Okta, simply because Cloudflare Access and Tunnels plus an identity provider (we use M365) give you so much value (and it's free up to 50 users). It's even better if you're already running DNS on Cloudflare: you can securely deploy access-controlled apps on the internet without too much hassle and management. And with the recent addition of Access for Infrastructure (SSH), you can securely extend SSH access just as seamlessly.
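For reference, the per-tunnel setup is a small `cloudflared` config file plus a DNS record; a sketch (tunnel ID, hostnames, and ports are placeholders):

```yaml
# /etc/cloudflared/config.yml (illustrative values)
tunnel: 6ff42ae2-765d-4adf-8112-31c55c1551ef
credentials-file: /etc/cloudflared/6ff42ae2-765d-4adf-8112-31c55c1551ef.json

ingress:
  # Access-protected internal app; no inbound firewall holes needed.
  - hostname: app.example.com
    service: http://localhost:8080
  # SSH over the tunnel, as mentioned above.
  - hostname: ssh.example.com
    service: ssh://localhost:22
  # Required catch-all rule.
  - service: http_status:404
```

The Access policies (who may reach each hostname, via which IdP) then live in the Cloudflare dashboard rather than in this file.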
The Bottlerocket issues really surprise me - not an experience I've shared even with heavy use. I use EKS with Bottlerocket + managed addons + Karpenter, and our security team is super happy that _nobody_ has access to the underlying nodes. Immutable OS is a key selling point, and Brupop "just works" to keep everything up to date without any input. Patching nodes is something I haven't had to think about in almost a year.
Everything in the article is an excellent point, but another big one is that schema changes become extremely difficult, because unknown applications may be relying on that schema.
Also, at a certain point the database becomes absolutely massive, and you will need teams of DBAs for its care and feeding.
The things that hurt the most are locking/blocking, data duplication (ghosting due to race conditions), and poor performance. The best advice is to RTFM for your database; yes, it is a lot to digest, which is why DBAs exist. Most of these footguns are due to poor architecture. You have to imagine multiple users/processes literally trying to write to the same record at the same time; once you realize this, a single table with simple key-values is completely inadequate.
Pro: every team probably needs user information, so don’t duplicate it in weird ways with uncertain consistency.
Con: it’s sadly likely that no one on your staff knows a damn thing about how an RDBMS works, and is seemingly incapable of reading documentation, so you’re gonna run into footguns faster. To be fair, this will also happen with isolated DBs, and will then be much more effort to rein in.
They are very similar to the pros and cons of having a monorepo. It encourages information sharing and cross-linkage between related teams. This is simultaneously its biggest pro and its biggest con.
Honestly, this is a reasonable itemization of experience with individual tools, but this reads like a recipe for Company Cake instead of a case-by-case statement of need, selection, and then evaluation. Cargo culting continues to wrap its tendrils around the industry and try to drag it into the depths of mediocrity, and this largely reads to me like a primer for how to saddle yourself with endless SaaS bills. I recognize that every situation has its nuances, but I think approaching running a company from "what tools do you use" is pretty much the biggest possible example of ignoring that maxim.
You will never agree 100% with someone else when it comes to decisions like this, but clearly there is a lot of history behind these decisions and they are a great starting point for conversations internally I think.
I just looked at Appsmith out of curiosity, since the author endorsed it as an admin panel builder. I had to double-check the name, because right now it is, surprise, surprise, an AI-powered application builder...
I used to use Replit for educational purposes, to be able to create simple programs in any language and share them with others (teachers, students). That was really useful.
Now Replit is a frontend to some AI chat that is supposed to write software for me.
Is this jumping into AI bandwagon everywhere a new trend? Is this really needed? Is this really profitable?
Just about anyone who aspires to raise capital in the current market is making themselves out to be AI. Give it a couple of years and we'll be onto the next craze. By that time I should have migrated my application off the blockchain into the metaverse.
I see you regret Datadog but there's no alternative - did you end up homebrewing metrics, or are you just living with their insane pricing model? In my experience they suck but not enough to leave.
Currently going through leaving DD at work. Many potential options, many companies trying to break in. The one that calls to me spiritually is: throw it all in Clickhouse (hosted Clickhouse is shockingly cheap) with a hosted HyperDX (logs and metrics UI) instance in front of it. HyperDX has its issues, but it's shocking how cheap it is to toss a couple hundred TB of logs/metrics into Clickhouse per month (compared to the kings ransom DD charges). And you can just query the raw rows, which really comes in handy for understanding some in-the-weeds metrics questions.
VictoriaMetrics stack. Better, cheaper, faster queries, more k8s native, etc.
Easy to run with budget saved from not being on Datadog + attracts smart and observability minded engineers to your team.
"No alternative" isn't quite right anymore, though I understand the feeling. The real problem with Datadog isn't the pricing - it's that their per-host model incentivizes you to care about infrastructure topology rather than user-facing behavior. You end up with 10,000 dashboards and still can't answer "is checkout broken right now?"
The open source stack has gotten genuinely viable: Prometheus/VictoriaMetrics for metrics, Grafana for viz, and OpenTelemetry as the collection layer means you're not locked into anyone's agent. The gap used to be in correlation - connecting a metric spike to a trace to a log line - but that's narrowed significantly.
The actual hard part of leaving DD isn't technical, it's organizational. DD becomes load-bearing for on-call runbooks, alert routing, and team muscle memory. Migration is less "swap the backend" and more "retrain your incident response."
If you're evaluating: the question I'd ask isn't "which vendor has the best dashboards" but "can I get from alert to root cause in under 5 minutes with this tool?" That's the metric that actually correlates with MTTR, and it's where most monitoring setups (including expensive ones) fail.
PagerDuty: they haven't yet hit the point where PD doubles the price on them, or they don't have everyone on the platform yet. It will be their next Datadog (too expensive).
I agree; 2-3x pricing because you have more people always felt like a cash grab, similar to an SSO tax. We also have a lot of complex PagerDuty configurations, and their APIs are painful: timestamps drift, templates get updated but don't show drift, and identifiers are not the same between the UI and the API. I regret implementing it within Terraform and would rather just let teams manage their own on-call, sadly.
PagerDuty's pricing trajectory is following the exact same playbook as Datadog. Start cheap enough that teams adopt it without finance approval, then jack up per-seat pricing once it's embedded in every runbook and escalation policy.
The insidious part with on-call tooling specifically is that switching costs are higher than almost any other category. Your escalation chains, schedules, integrations with monitoring, incident templates, post-mortem workflows - it all becomes organizational muscle memory. Migrating monitoring backends is a weekend project compared to migrating on-call routing.
What I've seen work: teams that treat on-call routing as a thin layer rather than a platform. If your schedules live in something portable (even a YAML file synced to whatever tool) and your alert routing is OpenTelemetry-native, swapping the actual dispatch tool becomes manageable. The teams that get locked in are the ones who build their entire incident process inside PD's UI.
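As a concrete example of that "thin portable layer" idea, the schedule and escalation data can live in version control as something like this (an entirely hypothetical format) and be synced into whichever dispatch tool you currently pay for:

```yaml
# oncall/payments.yaml - tool-agnostic source of truth (illustrative)
team: payments
escalation_policy:
  - notify: primary       # page the on-call first
    wait_minutes: 15
  - notify: secondary     # then the backup
    wait_minutes: 15
  - notify: eng-manager   # then escalate to a human with budget
schedules:
  primary:
    rotation: weekly
    handoff: "Mon 09:00 America/New_York"
    members: [alice, bob, carol]
  secondary:
    rotation: weekly
    members: [dave, erin]
```

A small sync job that pushes this into PD (or its replacement) via API is the switching-cost insurance; the vendor only ever holds a derived copy.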
Thanks for sharing, really helpful to see your thinking. I haven't fully embraced FaaS myself but never regretted it either.
Curious to hear more about Renovate vs Dependabot. Is it complicated to debug _why_ it's making a choice to upgrade from A to B? Working on a tool to do app-specific breaking change analysis so winning trust and being transparent about what is happening is top of mind.
When were you using quay.io? In the pre-CoreOS years, CoreOS years (2014-2018), or the Red Hat years?
Love it. Excellent reasoning for subjective decisions that don’t knock the product or solution itself as much as, “not what we specifically needed, and that’s okay”.
Bookmarked for my own infrastructure transformations. Honestly, if Okta could spit out a container or appliance that replaces on-prem ADDCs for LDAP, GPOs, and Kerberos, I’d give them all the money. They’re just so good.
FaaS is almost certainly a mistake. I get the appeal from an accountant's perspective, but from a debugging and development perspective it's really fucking awful compared to using a traditional VM. Getting at logs in something like azure functions is a great example of this.
I pushed really hard for FaaS until I had to support it. It's the worst kind of trap. I still get sweaty thinking about some of the issues we had with it.
What's the issue with logging? I would have expected stdout/stderr to get automatically transferred to the providers managed logging solution (e.g. cloudwatch).
Though I never really understood the appeal of FaaS over something like Google Cloud Run.
> Getting at logs in something like azure functions is a great example of this.
This is the least of the problems I've experienced with Azure Functions. You'd have to try very hard to NOT end up with useful logs in Application Insights if you use any of the standard Functions project templates. I'm wondering how this went wrong for you?
> “This EC2 instance type running 24/7 at full load is way less expensive than a Lambda running”.
For the same amount of memory, they should cost _nearly_ the same. Run the numbers. They're not significantly different services. Aside from this, you do NOT pay for IPv4 when using Lambda, but you do on EC2, so Lambda is almost always less expensive.
I'm curious how that plays out when you factor in other infrastructure components like DB and load balancers.
On Lambda, load balancing is handled out of the box, but you may need to introduce things like connection poolers for the DB that you could have gotten away without on EC2.
Think it also depends on whether you're CPU- or memory-constrained. Lambda seemed more expensive for CPU-heavy workloads, since you're stuck with certain CPU:mem ratios and there's more flexibility in EC2 instance types.
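"Run the numbers" is quick to do. A back-of-envelope sketch with illustrative x86 us-east-1-style rates (assumptions, not authoritative; check the current pricing pages, these drift):

```python
# Assumed rates: ~$0.0000166667 per GB-second of Lambda compute
# plus ~$0.20 per million requests. Illustrative only.
LAMBDA_GB_SECOND = 0.0000166667
LAMBDA_PER_REQUEST = 0.0000002

def lambda_monthly_usd(memory_gb: float, avg_ms: float, requests: float) -> float:
    """Rough monthly Lambda bill: compute (GB-seconds) plus request charges."""
    gb_seconds = memory_gb * (avg_ms / 1000.0) * requests
    return gb_seconds * LAMBDA_GB_SECOND + requests * LAMBDA_PER_REQUEST

seconds_per_month = 30 * 24 * 3600

# A 2 GB function saturated 24/7 (one 1-second request per second):
busy = lambda_monthly_usd(memory_gb=2.0, avg_ms=1000.0, requests=seconds_per_month)
print(f"${busy:.2f}/mo at full load")

# The same function at 50k requests/month, 200 ms each:
quiet = lambda_monthly_usd(memory_gb=2.0, avg_ms=200.0, requests=50_000)
print(f"${quiet:.2f}/mo at low traffic")
```

With these rates the saturated case lands around $87/month, the quiet one under a dollar, which is the whole debate in miniature: full load favors an always-on instance, spiky or idle traffic favors Lambda, and the CPU:mem ratio point above can tilt it either way.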
I feel so many of these. LOL @ GitHub endorse-ish, more -ish every day now. Overall though seems like a pretty good hit rate.
Surprised to see datadog as a regret - it is expensive but it's been enormously useful for us. Though we don't run kubernetes, so perhaps my baseline of expensive is wrong.
Nice, but how do those services combine with each other? How do you combine Notion, Slack, your Git hosting, Linear, and your CI/CD? If there are only URLs between them, it's hard to link all the work together.
Using GCP gives me the same feeling as vibe-coded source code. Technically works but deeply unsettling. Unless GCP is somehow saving you boatloads of cash, AWS is much better.
Infra guys doing DBA is a nightmare in my experience (usually clueless, and it gets less love than the sexier parts of infra). Devs too
I'm a little afraid to say it, but LLMs are getting quite good at query optimization. They can also read slow query logs and use extensions like pg_stat_statements.
I’m CTO for a startup that was recently acquired for $100M+. I agree with everything in this post apart from Go, because I’m just not a big fan of the language.
econner|10 days ago
dangus|10 days ago
GCP’s architecture seems clearly better to me especially if you are looking to be global.
Every organization I’ve ever witnessed eventually ends up with some kind of struggle with AWS’ insane organizations and accounts nightmare.
GCP’s use of folders makes way more sense.
GCP having global VPCs is also potentially a huge benefit if you want your users to hit servers that are physically close to them. On AWS you have to architect your own solution with global accelerator which becomes even more insane if you need to cross accounts, which you’ll probably have to do eventually because of the aforementioned insanity of AWS account/organization best practices.
sandorscribbles|9 days ago
What is the engineering used to separate a weak startup from a growing company, you ask? Well... Googlers again use arbitrary numbers rather than logic (a human-interaction-avoidance firewall) and set the floor at $30M of "publicly declared investment capital". So what happens when you're the GCP architect consultant hired to help a successful startup productionize their GCP infra, but their last round was private? Google tells the soon-to-be-$100M company they aren't real yet... so they go get their virtual CPU, RAM, and disk from AWS, which knows how to treat customers right: account managers who pick up the phone and invite you to lunch to talk about your successful startup growing on AWS. Googlers are the biggest risk factor in the far superior GCP infrastructure for any business, startup or Fortune 10.
unsnap_biceps|10 days ago
mannyv|9 days ago
We add support when we want to do something new, like MediaTailor + SSAI. At that point we're exploring and trying to get our heads around how things work. Once it works there's no real point in support.
That said, you need to ask your account manager about (1) discounts in exchange for spend commitments, and (2) technical assistance. In general we have a talk with our AM when we're doing something new, and they rope in SMEs from the various products for us.
We're not that big, and I haven't worked for large companies, and it's always been a mystery to me why people have problems dealing with AWS. I've always found them to be super responsive and easy to get ahold of. OTOH we actually know what we're doing technically.
Google Cloud, OTOH, is super fucked up. I mean seriously, I doubt anyone there has any idea WTF is happening or how anything works anymore. There's no real cohesion, or at least there wasn't the last time I was abused by GCP.
h1fra|9 days ago
rco8786|9 days ago
kstrauser|10 days ago
I, too, prefer McDonald's cheeseburgers to ground glass mixed with rusty nails. It's not so much that I love Terraform (spelled OpenTofu) as that it's far and away the least bad tool I've used in the space.
orwin|10 days ago
Terragrunt is the only sane way to deploy terraform/openTofu in a professional environment though.
walt_grata|10 days ago
nine_k|10 days ago
easterncalculus|10 days ago
MrDarcy|10 days ago
calmbonsai|10 days ago
If the author had a Ko-Fi they would've just earned $50 USD from me.
I've been thinking of making the leap away from JIRA and I concur on RDS, Terraform for IaC, and FaaS whenever possible. Google support is non-existent and I only recommend GC for pure compute. I hear good things about Bigtable, but I've never used it in production.
I disagree on the Slack usage, aside from the postmortem automation. Slack is just gonna be messy no matter what policies are put in place.
xyzzy_plugh|10 days ago
unethical_ban|10 days ago
Other options are email of course, and what, teams for instant messages?
notyourwork|10 days ago
SoftTalker|10 days ago
That made me laugh. Yes I get that they probably didn't use all of these at the same time.
movedx|9 days ago
Goes on to use Kubernetes and entire GitOps stacks to run a process. I truly do wonder what difficulty there is in transferring a binary to the system, writing a systemd unit file, and being done with it.
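For reference, the "binary plus unit file" approach really is about this much config. A minimal sketch (all paths and names here are hypothetical, and you'd tune `Restart`/hardening options for your app):

```ini
[Unit]
Description=myapp (illustrative service)
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/myapp
User=myapp
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Drop it in `/etc/systemd/system/myapp.service`, then `systemctl enable --now myapp`, and journald handles the logs.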
Hovertruck|9 days ago
1. Cost tracking meetings with your finance team are useful, but for AWS and other services that support it I highly recommend setting billing alarms. The sooner you can know about runaway costs, the sooner you can do something about it.
2. Highly recommend PGAnalyze (https://pganalyze.com/) if you're running Postgres in your stack. It's really intuitive, and has proven itself invaluable many times when debugging issues.
3. Having used Notion for like 7 years now, I don't think I love it as much as I used to. I feel like the "complexity" of documents gets inflated by Notion and the number of tools it gives you, and the experience of just writing text in Notion isn't super smooth IMO.
4. +1 to moving off JIRA. We moved to Shortcut years ago, I know Linear is the new hotness now.
5. I would put Datadog as an "endorse". It's certainly expensive but I feel we get loads of value out of it since we leaned so heavily into it as a central platform.
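The billing-alarm suggestion in point 1 can be sketched with a CloudWatch alarm on `EstimatedCharges` (this metric only exists in us-east-1, and requires "Receive Billing Alerts" to be enabled). This is a hedged sketch, not the only way to do it; the threshold and SNS topic ARN below are placeholders, and AWS Budgets is a newer alternative:

```python
# Build the parameters for a CloudWatch billing alarm. The actual API call
# (commented out below) assumes boto3 is installed and credentials exist.

def billing_alarm_params(threshold_usd: float, sns_topic_arn: str) -> dict:
    """Parameters for cloudwatch.put_metric_alarm on estimated charges."""
    return {
        "AlarmName": f"billing-over-{int(threshold_usd)}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 6 * 3600,  # the metric updates roughly every 6 hours
        "EvaluationPeriods": 1,
        "Threshold": threshold_usd,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }

# import boto3
# boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(
#     **billing_alarm_params(500.0, "arn:aws:sns:us-east-1:123456789012:billing"))
print(billing_alarm_params(500.0, "arn:example")["AlarmName"])
```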
bwilliams18|9 days ago
rf15|9 days ago
but... you are spending so much on AWS and premium support... surely you can afford that
Lucasoato|9 days ago
If you're starting everything from scratch, you might think that going to other providers (like Hetzner) is a good idea, and it may definitely be! But then you need to set up a Site2Site VPN because the second big customer of your B2B SaaS startup uses on-premises infrastructure and AWS has that out of the box, while you need an expert networking guy to do that the right way on Hetzner.
happymellon|9 days ago
darth_avocado|9 days ago
zie|9 days ago
winrid|9 days ago
rco8786|9 days ago
If you can manage docker containers in a cloud, you can manage them on your local. Plus you get direct access to your own containers, local filesystems and persistence, locally running processes, quick access for making environmental tweaks or manual changes in tandem with your agents, etc. Not to mention the cost savings.
cryptonector|9 days ago
b40d-48b2-979e|9 days ago
kolja005|10 days ago
This post was a great read.
Tangent to this, I've always found "best practices" to be a bit of a misnomer. In most cases in software and especially devops I have found it means "pay for this product that constrains the way that you do things so you don't shoot yourself in the foot". It's not really a "practice" if you're using a product that gives you one way to do something. That said my company uses a very similar tech stack and I would choose the same one if I was starting a company tomorrow, despite the fact that, as others have mentioned, it's a ton to keep in your head all at once.
dogleash|9 days ago
The good thing about a lot of devops saas is that you're not paying anyone on staff to understand the problem domain and guide your team. The bad thing is that you're not paying anyone on staff to understand the problem domain and guide your team.
indiestack|9 days ago
The key insight: for read-heavy workloads on a single machine, SQLite eliminates the network hop entirely. Response times drop to sub-15ms for full-text search queries. The tradeoff is write concurrency, but if your write volume is low (mine is ~20/day), it's a non-issue.
The one thing I'd add to the article: the biggest infrastructure regret I see is premature complexity. Running Postgres + Redis + a message queue when your app gets 100 requests/day is solving problems you don't have while creating problems you do (operational overhead, debugging distributed state, config drift between environments).
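The no-network-hop FTS point above can be demonstrated in a few lines with SQLite's FTS5 (bundled with most Python builds; the table and rows here are made up for illustration, and timings will vary by machine):

```python
# Full-text search with FTS5 in an in-memory SQLite database: no server,
# no network hop, query latency is just local CPU time.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE posts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [("intro to sqlite", "sqlite is an embedded database"),
     ("postgres notes", "postgres needs a server and a network hop"),
     ("fts howto", "full text search with fts5 is built in")],
)

start = time.perf_counter()
rows = conn.execute(
    "SELECT title FROM posts WHERE posts MATCH ? ORDER BY rank", ("fts5",)
).fetchall()
elapsed_ms = (time.perf_counter() - start) * 1000
print(rows, f"{elapsed_ms:.2f}ms")
```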
yakkomajuri|9 days ago
This is an important point.
perrygeo|9 days ago
But I don't like calling this tech debt. The tech debt concept is about taking on debt explicitly, as in choosing the sub-optimal path on purpose to meet a deadline then promising a "payment plan" to remove the debt in the future. Tech debt implies that you've actually done your homework but picked door number 2 instead. A very explicit choice, and one where decision makers must have skin in the game.
A hurried, implicit choice has none of those characteristics - it's ignorance leading (inevitably?) to novel problems. That doesn't fit the debt metaphor at all. We need to distinguish tech debt from plain old sloppy decision making. Maybe management can even start taking responsibility for decisions instead of shrugging and saying "Tech debt, what can you do, amirite?"
joshdick|9 days ago
Whoa, now there is a truth bomb. I've seen this happen a bunch, but never put it this succinctly before.
spprashant|9 days ago
kaycey2022|10 days ago
nine_k|10 days ago
By the same token, it's more efficient to let an LLM operate all these tools (and more) than to force an LLM to keep all of that on its "mind", that is, context.
consumer451|9 days ago
> Regret
Thanks for this data point. I am currently trying to make this call, and I was still on the fence. This has tipped me to the separate db side.
Can anyone else share their experience with this decision?
[0] https://cep.dev/posts/every-infrastructure-decision-i-endors...
zie|9 days ago
In my experience, it's easier to take schema out into a new DB in the off-chance it makes sense to do so.
The big place I'd disagree with this is when "your" data is actually customer data, and then you want 1 DB per customer whenever you can and SQLite is your BFF here. You have 1 DB for your stuff (accounting, whatever) and then 1 SQLite file per customer, that holds their data. Your customer wants a copy, you run .backup and send the file, easy peasy. They get pissed, rage quit and demand you delete all their data, easy!
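The `.backup`-and-send workflow described above is also available programmatically via SQLite's online backup API. A minimal sketch (paths and table names are illustrative):

```python
# One SQLite file per customer: export a consistent copy via the backup
# API, and "delete all my data" is just removing their file.
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
cust_path = os.path.join(workdir, "customer_42.db")

cust = sqlite3.connect(cust_path)
cust.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
cust.execute("INSERT INTO orders (item) VALUES ('widget')")
cust.commit()

# Equivalent of the shell's `.backup`: a consistent copy, safe on a live DB.
copy_path = os.path.join(workdir, "customer_42_export.db")
dest = sqlite3.connect(copy_path)
cust.backup(dest)
dest.close()
cust.close()

exported = sqlite3.connect(copy_path).execute("SELECT item FROM orders").fetchall()
print(exported)  # the customer's copy contains their rows

os.remove(cust_path)  # the rage-quit path: their data is gone
```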
austinsharp|9 days ago
[1] https://www.amazon.com/Designing-Data-Intensive-Applications... [2] https://www.amazon.com/Monolith-Microservices-Evolutionary-P...
darth_avocado|9 days ago
kwillets|9 days ago
- Crud accumulates in the [infrastructure thingie], and it’s unclear if it can be deleted.
- When there are performance issues, infrastructure (without deep product knowledge) has to debug the [infrastructure thingie] and figure out who to redirect to
- [infrastructure thingie] users can push bad code that does bad things to the [infrastructure thingie]. These bad things may PagerDuty alert the infrastructure team (since they own the [infrastructure thingie]). It feels bad to wake up one team for another team’s issue. With application owned [infrastructure thingies], the application team is the first responder.
JamesBarney|9 days ago
Having multiple teams with one code base that has one database is fine. Every line of code, table, and column needs to be owned by exactly ONE team.
Ownership is the most important part of making an organization effective.
OJFord|9 days ago
intrasight|9 days ago
zmj|9 days ago
sylens|9 days ago
wavemode|10 days ago
past discussion: https://news.ycombinator.com/item?id=39313623
tomhow|10 days ago
Almost every infrastructure decision I endorse or regret - https://news.ycombinator.com/item?id=39313623 - Feb 2024 (626 comments)
Meetvelde|9 days ago
neo_doom|10 days ago
I've worked with hundreds of customers to integrate IdPs with our application, and Google Workspace was by far the worst of the big players (Entra ID, Okta, Ping). It's extremely inflexible for even the most basic SAML configuration. Stay far, far away.
0xbadcafebee|10 days ago
nevalainen|10 days ago
mwcampbell|10 days ago
x3n0ph3n3|10 days ago
mettamage|9 days ago
nicoburns|9 days ago
I would personally recommend https://www.shortcut.com which is very well designed, and also made some really sensible improvements over the time that we used it.
phrotoma|9 days ago
AIorNot|9 days ago
These days, AI in the doc, spec, and production lifecycle means we need AI-first ticket tooling. I haven't used Linear, but I suspect it works far better with AI than JIRA.
ubercore|9 days ago
bigiain|10 days ago
I've been working mostly at startups most of my career (for Sydney Australia values of "start up" which mostly means "small and new or new-ish business using technology", not the Silicon Valley VC money powered moonshot crapshoot meaning). Two of those roles (including the one I'm in now) have been longer that a decade.
And it's pretty much true that almost all infrastructure (and architecture) decisions are things that 4-5 years later become regrets. Some standouts from 30 years:
I didn't choose Macromind/Macromedia Director in '94 but that was someone else's decision I regretted 5 years later.
I shouldn't have chosen to run a web business on ISP web hosting and Perl4 in '95 (yay /cgi-bin).
I shouldn't have chosen globally colocated desktop pc linux machines and MySQL in '98/99 (although I got a lot of work trips and airline miles out of that).
I shouldn't have chosen Python2 in 2007, or even worse Angular2 in 2011.
I _probably_ shouldn't have chosen Arch Linux (and a custom/bastardised Pacman repo) for a hardware startup in 2013.
I didn't choose Groovy on Grails in 2014 but I regretted being recruited into being responsible for it by 2018 or so.
I shouldn't have chosen Java/MySQL in 2019 (or at least I should have kept a much tighter leash on the backend team and their enterprise architecture astronaut).
The other perspective on all those decisions though, each of them allowed a business to do the things they needed to take money off customers (I know I know, that's not the VC startup way...) Although I regretted each of those later, even in retrospect I think I made decent pragmatic choices at the time. And at this stage of my career I've become happy enough knowing that every decision is probably going to have regrets over a 4 or 5 year timeframe, but that most projects never last long enough for you to get there - either the business doesn't pan out and closes the project down, or a major ground up rewrite happens for reasons often unrelated to 5 year old infrastructure or architecture choices.
arush15june|9 days ago
stroebs|9 days ago
lightyrs|10 days ago
hambes|10 days ago
1: https://kubernetes.io/blog/2026/01/29/ingress-nginx-statemen...
Grimburger|10 days ago
Knative on k8s works well for us; there are some oddities about it, but in general it does the job.
zem|10 days ago
stackskipton|10 days ago
Everything in the article is an excellent point, but another big one is that schema changes become extremely difficult, because you have unknown applications possibly relying on that schema.
Also, at a certain point the database becomes absolutely massive and you will need teams of DBAs for its care and feeding.
fidgetstick|10 days ago
https://www.enterpriseintegrationpatterns.com/patterns/messa...
rawgabbit|10 days ago
sgarland|10 days ago
Con: it’s sadly likely that no one on your staff knows a damn thing about how an RDBMS works, and is seemingly incapable of reading documentation, so you’re gonna run into footguns faster. To be fair, this will also happen with isolated DBs, and will then be much more effort to rein in.
brandmeyer|10 days ago
esoterae|9 days ago
jbmsf|10 days ago
I also reached a lot of similar decisions and challenges, even where we differ (ECS vs EKS) I completely understand your conclusions.
jmward01|10 days ago
piokoch|9 days ago
I used to use Replit for educational purposes, to be able to create simple programs in any language and share them with others (teachers, students). That was really useful.
Now Replit is a frontend to some AI chat that is supposed to write software for me.
Is this jumping into AI bandwagon everywhere a new trend? Is this really needed? Is this really profitable?
hare2eternity|9 days ago
jrjeksjd8d|10 days ago
lelandbatey|10 days ago
jpgvm|10 days ago
stackskipton|10 days ago
velocity3230|10 days ago
jamiemallers|9 days ago
The open source stack has gotten genuinely viable: Prometheus/VictoriaMetrics for metrics, Grafana for viz, and OpenTelemetry as the collection layer means you're not locked into anyone's agent. The gap used to be in correlation - connecting a metric spike to a trace to a log line - but that's narrowed significantly.
The actual hard part of leaving DD isn't technical, it's organizational. DD becomes load-bearing for on-call runbooks, alert routing, and team muscle memory. Migration is less "swap the backend" and more "retrain your incident response."
If you're evaluating: the question I'd ask isn't "which vendor has the best dashboards" but "can I get from alert to root cause in under 5 minutes with this tool?" That's the metric that actually correlates with MTTR, and it's where most monitoring setups (including expensive ones) fail.
mlrtime|9 days ago
TechIsCool|9 days ago
jamiemallers|9 days ago
The insidious part with on-call tooling specifically is that switching costs are higher than almost any other category. Your escalation chains, schedules, integrations with monitoring, incident templates, post-mortem workflows - it all becomes organizational muscle memory. Migrating monitoring backends is a weekend project compared to migrating on-call routing.
What I've seen work: teams that treat on-call routing as a thin layer rather than a platform. If your schedules live in something portable (even a YAML file synced to whatever tool) and your alert routing is OpenTelemetry-native, swapping the actual dispatch tool becomes manageable. The teams that get locked in are the ones who build their entire incident process inside PD's UI.
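The "schedules live in something portable" idea above can be surprisingly small. A sketch of a weekly rotation kept as plain data, with the dispatch tool treated as a thin layer (the names, epoch, and one-week shift length are made-up assumptions):

```python
# On-call rotation as portable data: any dispatch tool can be synced from
# this, so swapping vendors doesn't mean rebuilding the schedule.
from datetime import date

ROTATION = ["alice", "bob", "carol"]   # order in which people take weeks
EPOCH = date(2024, 1, 1)               # a Monday; week 0 starts here

def on_call(day: date) -> str:
    """Who holds the pager on a given day, assuming one-week shifts."""
    weeks_since_epoch = (day - EPOCH).days // 7
    return ROTATION[weeks_since_epoch % len(ROTATION)]

print(on_call(date(2024, 1, 3)))   # week 0 -> alice
print(on_call(date(2024, 1, 10)))  # week 1 -> bob
```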
robszumski|10 days ago
Curious to hear more about Renovate vs Dependabot. Is it complicated to debug _why_ it's making a choice to upgrade from A to B? Working on a tool to do app-specific breaking change analysis so winning trust and being transparent about what is happening is top of mind.
When were you using quay.io? In the pre-CoreOS years, CoreOS years (2014-2018), or the Red Hat years?
artyom|9 days ago
This is a classic. I'd say that for every company, big or small, this ends up taking the #1 spot on technical debt.
allanbreyes|9 days ago
[1]: https://martinfowler.com/bliki/IntegrationDatabase.html
stego-tech|9 days ago
Bookmarked for my own infrastructure transformations. Honestly, if Okta could spit out a container or appliance that replaces on-prem ADDCs for LDAP, GPOs, and Kerberos, I’d give them all the money. They’re just so good.
AIorNot|9 days ago
YetAnotherNick|9 days ago
bob1029|9 days ago
FaaS is almost certainly a mistake. I get the appeal from an accountant's perspective, but from a debugging and development perspective it's really fucking awful compared to using a traditional VM. Getting at logs in something like azure functions is a great example of this.
I pushed really hard for FaaS until I had to support it. It's the worst kind of trap. I still get sweaty thinking about some of the issues we had with it.
CodesInChaos|9 days ago
Though I never really understood the appeal of FaaS over something like Google Cloud Run.
antonyt|9 days ago
This is the least of the problems I've experienced with Azure Functions. You'd have to try very hard to NOT end up with useful logs in Application Insights if you use any of the standard Functions project templates. I'm wondering how this went wrong for you?
themafia|10 days ago
> “This EC2 instance type running 24/7 at full load is way less expensive than a Lambda running”.
For the same amount of memory they should cost _nearly_ identical. Run the numbers. They're not significantly different services. Aside from this you do NOT pay for IPv4 when using Lambda, you do on EC2, and so Lambda is almost always less expensive.
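One way to actually "run the numbers" for 1 GB of memory. The prices below are illustrative placeholders (check the current AWS pricing pages; request charges, free tier, and IPv4 fees are omitted) — the takeaway is that the comparison hinges on utilization, not that either service always wins:

```python
# Monthly cost of 1 GB: an always-on small instance vs Lambda billed per
# GB-second, and the utilization level at which the two cross over.
LAMBDA_PER_GB_SECOND = 0.0000166667  # assumed on-demand x86 rate
EC2_PER_HOUR = 0.0104                # assumed 1 GB instance (e.g. t3.micro)
HOURS_PER_MONTH = 730

ec2_month = EC2_PER_HOUR * HOURS_PER_MONTH
lambda_full_load = LAMBDA_PER_GB_SECOND * HOURS_PER_MONTH * 3600  # busy 24/7

# Utilization below which Lambda undercuts the always-on instance:
break_even = ec2_month / lambda_full_load

print(f"EC2 ~${ec2_month:.2f}/mo, Lambda at 100% load ~${lambda_full_load:.2f}/mo")
print(f"Lambda wins below ~{break_even:.0%} utilization")
```

Under these assumed prices, an always-on instance is cheaper for a fully loaded workload, while Lambda wins for bursty ones.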
nijave|9 days ago
On Lambda, load balancing is handled out of the box but you may need to introduce things like connection poolers for the DB you could have gotten away without on EC2
Think it also depends if you're CPU or memory constrained. Lambda seemed more expensive for CPU heavy workloads since you're stuck with certain CPU:mem ratios and there's more flexibility on EC2 instance types
prplfsh|9 days ago
Surprised to see datadog as a regret - it is expensive but it's been enormously useful for us. Though we don't run kubernetes, so perhaps my baseline of expensive is wrong.
ink_13|10 days ago
Just FYI, the article is two years old.
ttoinou|10 days ago
thundergolfer|9 days ago
modal.com exists now
isoprophlex|9 days ago
I love modal. I think they got FaaS for GPU exactly right, both in terms of their SDK and the abstractions/infra they provide.
ungreased0675|9 days ago
It seems excessive and expensive. Is this what most startups are doing these days?
herpdyderp|9 days ago
weedhopper|10 days ago
0xbadcafebee|10 days ago
RDS is a very quick way to expand your bill, followed by EC2, followed by S3. RDS for production is great, but you should avoid the bizarre HN trope of "Postgres for everything" with RDS. It makes your database unnecessarily larger which expands your bill. Use it strategically and your cost will remain low while also being very stable and easy to manage. You may still end up DIYing backups. Aurora Serverless v2 is another useful way to reduce bill. If you want to do custom fancy SQL/host/volume things, RDS Custom may enable it.
I'm starting to think Elasticache is a code smell. I see teams adopt it when they literally don't know why they're using it. Similar to the "Postgres for everything" people, they're often wasteful, causing extra cost and introducing more complexity for no benefit. If you decide to use Elasticache, Valkey Serverless is the cheapest option.
Always use ECR in AWS. Even if you have some enterprise artifact manager with container support... run your prod container pulls with ECR. Do not enable container scanning, it just increases your bill, nobody ever looks at the scan results.
I no longer endorse using GitHub Actions except for non-business-critical stuff. I was bullish early on with their Actions ecosystem, but the whole thing is a mess now, from the UX to the docs to the features and stability. I use it for my OSS projects but that's it. Most managed CI/CD sucks. Use Drone.io for free if you're small, use WoodpeckerCI otherwise.
Buying an IP block is a complicated and fraught thing (it may not seem like it, but eventually it is). Buy reserved IPs from AWS, keep them as long as you want, you never have to deal with strange outages from an RIR not getting the correct contact updated in the correct amount of time or some foolishness.
He mentions K8s, and it really is useful, but as a staging and dev environment. For production you run into the risk of insane complexity exploding, and the constant death march of upgrades and compatibility issues from the 12 month EOL; I would not recommend even managed K8s for prod. But for staging/dev, it's fantastic. Give your devs their own namespace (or virtual cluster, ideally) and they can go hog wild deploying infrastructure and testing apps in a protected private environment. You can spin up and down things much easier than typical AWS infra (no need for terraform, just use Helm) with less risk, and with horizontal autoscaling that means it's easier to save money. Compare to the difficulty of least-privilege in AWS IAM to allow experiments; you're constantly risking blowing up real infra.
Helm is a perfectly acceptable way to quickly install K8s components, big libraries of apps out there on https://artifacthub.io/. A big advantage is its atomic rollouts which makes simple deploy/rollback a breeze. But ExternalSecrets is one of the most over-complicated annoying garbage projects I've ever dealt with. It's useful, but I will fight hard to avoid it in future. There are multiple ways to use it with arcane syntax, yet it actually lacks some useful functionality. I spent way too much time trying to get it to do some basic things, and troubleshooting it is difficult. Beware.
I don't see a lot of architectural advice, which is strange. You should start your startup out using all the AWS well-architected framework that could possibly apply to your current startup. That means things like 1) multiple AWS accounts (the more the better) with a management account & security account, 2) identity center SSO, no IAM users for humans, 3) reserved CIDRs for VPCs, 4) transit gateway between accounts, 5) hard-split between stage & prod, 6) openvpn or wireguard proxy on each VPC to get into private networks, 7) tagging and naming standards and everything you build gets the tags, 8) put in management account policies and cloudtrail to enforce limitations on all the accounts, to do things like add default protections and auditing. If you're thinking "well my startup doesn't need that" - only if your startup dies will you not need it, and it will be an absolute nightmare to do it later (ever changed the wheels on a moving bus before?). And if you plan on working for more than one startup in your life, doing it once early on means it's easier the second time. Finally if you think "well that will take too long!", we have AI now, just ask it to do the thing and it'll do it for you.
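Point 3 above (reserved CIDRs) is cheap to do up front and painful to retrofit. A sketch of carving non-overlapping per-account VPC blocks out of one supernet, so later transit-gateway peering never collides (the supernet and account names are made up):

```python
# Pre-allocate a /16 VPC block per AWS account from a single /8 supernet.
import ipaddress

SUPERNET = ipaddress.ip_network("10.0.0.0/8")
vpc_blocks = SUPERNET.subnets(new_prefix=16)  # generator of /16 blocks

accounts = ["mgmt", "security", "stage", "prod"]
plan = {name: next(vpc_blocks) for name in accounts}

for name, block in plan.items():
    print(f"{name}: {block}")

# No two blocks overlap by construction:
blocks = list(plan.values())
assert all(not a.overlaps(b) for i, a in enumerate(blocks) for b in blocks[i + 1:])
```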
zbentley|10 days ago
> Do not enable container scanning, it just increases your bill, nobody ever looks at the scan results.
God I wish that were true. Unfortunately, ECR scanning is often cheaper and easier to start consuming than buying $giant_enterprise_scanner_du_jour, and plenty of people consider free/OSS scanners insufficient.
Stupid self inflicted problems to be sure, but far from “nobody uses ECR scanning”.
findalex|9 days ago
dwedge|9 days ago
dangoodmanUT|10 days ago
modal.com???
bfeynman|9 days ago
gib444|10 days ago
Hire a DBA ASAP. They also need to rein in the laziness of all the other developers when designing and interacting with the DB. The horrors a dev can create in the DB can take years to undo.
nijave|9 days ago
Doesn't necessarily prevent a terrible schema but it's become a lot easier to fix abomination queries at least
phrotoma|9 days ago
gnarbarian|10 days ago
shockwaverider|9 days ago
rixed|10 days ago
mlrtime|9 days ago
In which world does a large tech company exist without problems? And if so, how big is it, how many customers does it have, etc.?
mads_quist|9 days ago
rewilder12|9 days ago
MaXtreeM|10 days ago
ohyoutravel|9 days ago