top | item 29002377

andydb | 4 years ago

To any who might see this, I'm the author of the blog post, and led the engineering team that built CockroachDB Serverless. I'll be monitoring this thread in case there are any questions you'd like to ask me about it.

vegasje|4 years ago

I'm quite confused by Request Units, and trying to predict how many would be used by queries/operations.

I launched a test cluster, and its RU count keeps climbing even though I haven't connected to it yet. At this rate, the cluster would use over 8M of the available 10M RUs in a month without my touching it.

Coming from AWS, one of the most difficult aspects of using Aurora is guessing how many I/Os will be used for different workloads. It would be a shame to introduce this complexity for CockroachDB Serverless, especially if the RUs are impacted by internal cluster operations that aren't initiated by the user.

andydb|4 years ago

You've run into a "rough edge" of the beta release that will be fixed soon. When you keep the Cluster Overview page open, it runs queries against your cluster so that it can display information like "# databases in the cluster". Unfortunately, those queries run every 10 seconds in the background, and are consuming RUs, which is why you see RU usage without having connected to the cluster yet. But never fear, we'll get that fixed.

One thing that may not be clear - you get 10M RUs for free, up front, but you also get a constant accumulation of 100 RU/s for free throughout the month. That adds up to >250M free RUs per month. This ensures that your cluster is always accessible, and that you never truly "run out" of RUs - at most you get throttled to 100 RU/s.
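
For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes a 30-day month; the 10M upfront grant, the 100 RU/s accrual, and the "over 8M RUs in a month" idle figure come from the comments above, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def accrued_rus(rate_ru_per_s=100, days=30):
    """RUs accumulated over `days` at a constant free accrual rate."""
    return rate_ru_per_s * SECONDS_PER_DAY * days


def average_rate(total_rus, days=30):
    """Average RU/s implied by consuming `total_rus` evenly over `days`."""
    return total_rus / (SECONDS_PER_DAY * days)


UPFRONT_RUS = 10_000_000  # free RUs granted up front (from the thread)

accrued = accrued_rus()          # 100 RU/s sustained for 30 days
total = UPFRONT_RUS + accrued    # upfront grant plus accrual

# Idle drain reported earlier in the thread: roughly 8M RUs per month.
idle_rate = average_rate(8_000_000)

print(f"accrued over 30 days: {accrued:,} RUs")   # 259,200,000
print(f"upfront + accrued:    {total:,} RUs")     # 269,200,000
print(f"idle drain:           {idle_rate:.2f} RU/s vs. 100 RU/s accrual")
```

Under these assumptions the background drain works out to about 3 RU/s, well under the 100 RU/s accrual rate.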

I hear you on the difficulty of understanding how your queries map to RUs. SQL queries can be enormously complex and differ by multiple orders of magnitude from one another in terms of their compute cost. That's why we built a near real-time dashboard that shows you how quickly you're consuming RUs. You can run your workload for a few minutes and then check back on the dashboard to see how many RUs that workload consumed.

sebbul|4 years ago

I haven't connected to my cluster, but my RUs keep going up. Extrapolating, I'll be at roughly 20M RUs after 30 days without using it.

yla92|4 years ago

> Each node runs in its own K8s pod, which is not much more than a Docker container with a virtualized network and a bounded CPU and memory capacity. Dig down deeper, and you’ll discover a Linux cgroup that can reliably limit the CPU and memory consumption for the processes. This allows us to easily meter and limit SQL resource consumption on a per-tenant basis.

Nice use of K8s here, and a great post overall! This isn't about CockroachDB itself, but about Kubernetes and cgroups: I'm wondering whether you ran into the infamous CPU throttling issue [0] while building the metering and limiting.

[0] : https://github.com/kubernetes/kubernetes/issues/67577
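
For context on how a pod CPU limit becomes a cgroup setting, here is a rough sketch. It assumes the default CFS period of 100 ms and an idealized model where a container is throttled only once it exceeds its quota within a period; the linked issue is about cases where real behavior is worse than this. The function names are hypothetical.

```python
CFS_PERIOD_US = 100_000  # default CFS scheduling period: 100 ms


def cfs_quota_us(cpu_limit_cores, period_us=CFS_PERIOD_US):
    """CPU time (microseconds) a container may use per period for a given limit."""
    return int(cpu_limit_cores * period_us)


def throttled_us(demand_us, cpu_limit_cores, period_us=CFS_PERIOD_US):
    """CPU time in one period that must wait for the next period (idealized)."""
    return max(0, demand_us - cfs_quota_us(cpu_limit_cores, period_us))


# A pod limited to half a core gets 50 ms of CPU time per 100 ms period.
print(cfs_quota_us(0.5))           # 50000
# If it wants 80 ms of CPU in one period, 30 ms is pushed to later periods.
print(throttled_us(80_000, 0.5))   # 30000
```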

andydb|4 years ago

We haven't run into that so far, but thank you for pointing it out as something to watch out for.

jawns|4 years ago

What's your elevator pitch for why my organization should use CockroachDB Serverless vs. something like AWS Aurora Serverless, particularly if we're already relatively invested in the AWS ecosystem?

qaq|4 years ago

I'm not a CDB employee, but CDB scales beyond what Aurora can support.

andydb|4 years ago

Oh boy, I'm an engineer, but I'll do my best to pretend I'm on the sales or marketing team for a minute...

First of all, CockroachDB Serverless is available on AWS, and should integrate quite well with that ecosystem, including with Serverless functions offered by AWS Lambda.

Here are a few advantages of CockroachDB Serverless that Aurora will struggle to match (note that we're still working on Serverless multi-region support):

1. Free-forever tier. We offer a generous "free forever" tier that doesn't end after a month or a year. As the blog post outlines, our architecture is custom-built to make this economical.

2. No ceiling on write scalability. Even non-Serverless Aurora runs into increasing trouble as the number of writes / second increases past what a single machine can handle. CockroachDB just keeps going. We've had multiple high-scale customers who hit Aurora limits and had to move over to Cockroach to support business growth.

3. True multi-region support. Aurora only allows read-only, stale replicas in other regions, while CRDB allows full ACID SQL transactions. If you want to move into other regions of the world and have latency concerns or GDPR concerns, CRDB is custom-built to make the full SQL experience possible.

4. No Cloud lock-in. Perhaps this is not a concern for your company, but many companies don't like getting completely locked in to a single Cloud provider. CockroachDB works on multiple cloud providers and doesn't have a monetary interest in locking you in to just one.

5. Online schema changes. CockroachDB supports operations like adding/removing columns, renaming tables, and adding constraints without any downtime. You can perform arbitrary schema changes without disturbing your running application workloads. SQL DDL "just works".

6. Cold start in an instant. CockroachDB clusters automatically "scale to zero" when they're not in use. When traffic arrives, they resume in a fraction of a second. Compare that to Aurora, where you need to either have a minimum compute reservation, or you need to endure multi-second cold starts.

7. Great support. We've got a friendly Slack room where you can get free support and rub shoulders with fellow CockroachDB users, as well as CockroachDB folks like myself. We also have 24/7 paid support for deeper problems you might encounter.

Taken all together, CockroachDB can go wherever your business needs it to go, without all the constraints that traditional SQL databases usually have. Do you want thousands of clusters for testing/development/tiny apps at a reasonable cost? Could your business take off and need the scale that CRDB offers? Could your business need to expand into multiple geographic regions? Are some of your workloads erratic or periodic, but should still start up instantly when needed? It's not just about what you need now, but what you may need in the future. It makes sense to plan ahead and go with a database that has "got you covered" wherever you need to go.

dilyevsky|4 years ago

I haven't had time to read the whole thing yet, so sorry if this is already answered: is it possible to run this SQL pod/storage pod separation yourself with CRDB community/enterprise? We run enterprise CRDB, but it's all in one process (with replicas).

andydb|4 years ago

It's not currently possible, partly because it complicates the deployment model quite a bit. Dynamically bringing SQL pods up and down requires a technology like Kubernetes. It takes some serious operational know-how to keep it running smoothly, which is why we thought it would be perfect for a managed Cloud service.

What would be your company's reasons for wanting this available in self-hosted CRDB? What kinds of use cases would it address for you?

nerdywordy|4 years ago

Can this CRDB Serverless offering handle the burst connections of a serverless function based app? Are pooling or query queueing features built in?

Or would users face connection limits at some upper bound until the old function connections get spun down?

amitkgupta84|4 years ago

How will you handle PrivateLink and VPC Peering connections into customer VPCs/Vnets with the multitenant architecture?

andydb|4 years ago

We're still discussing options there, so I don't have an answer on how we might handle them. We do recognize that many business customers will have such requirements.

chillfox|4 years ago

How low can CockroachDB go with resource usage?

Is the open source version viable for small hobby projects?

justsomeuser|4 years ago

Can I connect using an SSH tunnel (without the SSL cert)?

andydb|4 years ago

At this point, only Postgres SSL connections are supported.
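
As an illustration of what an SSL-only Postgres connection looks like from the client side, here is a minimal sketch that builds a connection URL with the standard libpq `sslmode` and `sslrootcert` parameters. The host, user, database, and certificate path are placeholders, and port 26257 is assumed to be CockroachDB's default SQL port; use whatever connection details your cluster page shows.

```python
from urllib.parse import quote, urlencode


def build_postgres_url(user, password, host, database,
                       port=26257, root_cert=None):
    """Build a postgresql:// URL that requires a verified TLS connection."""
    params = {"sslmode": "verify-full"}  # verify the server cert and hostname
    if root_cert:
        params["sslrootcert"] = root_cert  # CA certificate used for verification
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?{urlencode(params)}"
    )


# All values below are placeholders, not real cluster details.
url = build_postgres_url(
    user="app_user",
    password="s3cret/pw",
    host="example-cluster.invalid",
    database="defaultdb",
    root_cert="certs/root.crt",
)
print(url)
```

The resulting URL can be passed to any Postgres driver that accepts libpq-style connection URIs.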

pier25|4 years ago

How does this serverless offering compare to Fauna?

reilly3000|4 years ago

I would assume the main point is that it’s actual PostgreSQL, not a new query language.