google vp here: we appreciate the feedback! i generally agree that if you have a strong understanding of your static capacity needs, pre-provisioning VMs is likely to be more cost efficient with today's pricing. cloud run GPUs are ideal for more bursty workloads -- maybe a new AI app that doesn't yet have PMF, where you really need that scale-to-zero + fast start for more sparse traffic patterns.
jakecodes|9 months ago
I’m fairly experienced with GCP, but even then, the billing model here caught me off guard. When you’re dealing with machines that can run up to $64K/month, small missteps get expensive quickly. Predictability is key, and I’d love to see more safeguards or clearer cost modeling tooling around these types of workloads.
gabe_monroy|9 months ago
ashishb|9 months ago
Indeed. IIRC, if you get a single request every 15 mins (~100 requests a day), you will pay for Cloud Run GPU for the full day.
krembo|9 months ago
mgraczyk|9 months ago
Sn0wCoder|9 months ago
ashishb|9 months ago
So if you get all your requests in a 2 hours window then that's great. It will scale to zero for rest of the 22 hours.
However, if you get at least one request every 15 mins then you will pay for 24 hours and it is ~3X more expensive then equivalent VM on Google Cloud.