Data API for Amazon Aurora Serverless

coderecipe|6 years ago

With this, a Lambda no longer needs to be in a VPC to call RDS, which means cold start time will drop from seconds to milliseconds. I made a ready-to-use recipe (source code, deployment script, and demo included) here: https://coderecipe.ai/architectures/77374273. Hopefully this helps others onboard to this new API easily.

scarface74|6 years ago

This only works for Aurora Serverless, not regular Aurora or any other managed databases.

etaioinshrdlu|6 years ago

I told my AWS account manager today that this is what I wanted to see on Aurora Serverless:

- mysql 5.7 compatibility

- acting as replication master or slave

- faster upscaling, more like 5s instead of 30s

- publicly accessible over internet (the rest of RDS has this)

- aurora parallel query built in

- aurora multi master built in

Basically, I asked for one product that merges all their interesting features. That sounds like a nice one-size-fits-all database. I would very much like to use it in production. It would require very little maintenance.

hn_throwaway_99|6 years ago

I wonder what effect this may have for AWS Lambdas connecting to a DB for synchronous calls (e.g. through API Gateway). The biggest issue with Lambdas IMO is the cold start time. If your Lambda is in a VPC, the cold start time is around 8-10 seconds, and if you have decent security practices your database will be in a VPC. I know AWS said they would be working on improving Lambda VPC cold start times, but I would like to know if using Aurora Serverless with this kind of "connectionless connection" would also get rid of the need to be in a VPC. I've used Aurora (and really, really liked it) but I haven't used Aurora Serverless.

ftcHn|6 years ago

Would it "get rid of the need to be in a VPC"? I think yes.

It looks like by enabling Data API, you expose that endpoint to the entire internet - which is secured like all the other AWS services with HTTPS, IAM, etc.
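For reference, here is a minimal sketch of what a call over that endpoint might look like from Python, assuming boto3's `rds-data` client and its `execute_statement` operation. The helper names `run_query` and `unwrap` and all ARN and database values are placeholders of mine, not taken from the linked recipe. The request is IAM-signed like any other AWS API call, and the database credentials come from a Secrets Manager secret, so the caller needs no VPC networking.

```python
def unwrap(field):
    """Turn one Data API field, e.g. {'stringValue': 'x'}, into a plain Python value."""
    if field.get("isNull"):
        return None
    # Each field dict carries a single typed key such as longValue or stringValue.
    return next(iter(field.values()))


def run_query(client, sql, params=None, *, resource_arn, secret_arn, database):
    """Run one SQL statement via the Data API and return rows as lists of values.

    `client` is expected to behave like boto3.client("rds-data").
    For simplicity this sketch sends every parameter as a string value.
    """
    response = client.execute_statement(
        resourceArn=resource_arn,   # ARN of the Aurora Serverless cluster
        secretArn=secret_arn,       # ARN of the Secrets Manager secret with DB credentials
        database=database,
        sql=sql,
        parameters=[
            {"name": name, "value": {"stringValue": str(value)}}
            for name, value in (params or {}).items()
        ],
    )
    return [[unwrap(f) for f in record] for record in response.get("records", [])]
```

In a Lambda you would probably create the client once at module level with `boto3.client("rds-data")` and pass it in, e.g. `run_query(client, "SELECT name FROM users WHERE id = :id", {"id": 7}, resource_arn=..., secret_arn=..., database=...)`, where the table and column names are made up for illustration.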

coderecipe|6 years ago

Feel free to take a look at my code sample using this new Data API on Aurora Serverless here: https://coderecipe.ai/architectures/77374273 (demo and source code included). It removes the need for a VPC and it works like a charm.

davmar|6 years ago

came here wondering the same thing. those cold start times aren't acceptable for user-facing apps and i had to switch a serverless project from RDS in a VPC to dynamodb. even worse is that these cold start times are for each lambda, so if you've got concurrent usage then each new lambda spin up causes a cold start.

having said that, i'm actually pretty happy with dynamodb...so far.

whoevercares|6 years ago

[deleted]

unknown|6 years ago

[deleted]

cavisne|6 years ago

Another cool thing about this is that it avoids the connection pool issue with Lambda (where concurrent requests can't reuse connections).

Aurora is already pretty good at handling a lot of connections but this is even better.

keysmasher|6 years ago

Not really, it was solved without this API given aurora serverless would manage connections and scaling automatically (https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...).

But the real problem was that the connection time made it unusable for any client-facing application. I tried it after it was released (not the preview). I really doubt this API would respond any faster.

djhworld|6 years ago

You can create a connection pool in a static context that lives throughout the lifetime of the JVM.

Although admittedly, if Lambda scales to multiple JVMs as the request rate increases, you'll have multiple pools. And if your request rate is low, you won't get much benefit.

tienshiao|6 years ago

The beta version seemed like it had pretty poor performance: https://www.jeremydaly.com/aurora-serverless-data-api-a-firs...

Does anyone have performance feedback now that it is no longer beta?

reilly3000|6 years ago

I'm definitely excited about this, especially after paying $36/month for a NAT that I barely used for a long, long time, and spending too many hours configuring it for my Lambdas.

That said, I don't know how Jeremy Daly got away with making that post, per AWS preview terms. They are pretty explicit about not posting benchmarks on their preview products, and that makes sense as the API is not stable at all.

Still, I'm glad to see the data and hope that the performance has improved. I wasn't accepted into the preview, and I've started work now to move most of our infrastructure to GCP. It notably does not require any fancy footwork to have a Cloud Function talk to a Cloud SQL instance https://cloud.google.com/functions/docs/sql#overview

mattnguyen|6 years ago

Jeremy has updated the post in response to the announcement.

- Lots of improvements & better documentation

- Smaller response size, but can be cut down a lot more

- Sub 100ms query performance

> I’m really impressed by the updates that have been made. I do want to reiterate that this isn’t an easy problem to solve, so I think the strides they’ve made are quite good. I’m not sure how connection management works under the hood, so I’ll likely need to experiment with that a bit to measure concurrent connection performance.

edit: formatting

blaisio|6 years ago

... Don't you have to establish an HTTPS connection to use this API? Is that really easier than using the existing MySQL protocol? Or is it really so horrible that HTTPS is faster?

Things establishing new connections will never be as fast as things reusing existing connections. It seems wasteful to ignore this.

smt88|6 years ago

This appears to be targeted at Lambda functions, which can't reuse existing connections between executions.

Also, establishing an HTTP connection is much faster than establishing a typical database connection, in my experience. I don't know why that is.

tybit|6 years ago

Unlike a properly configured RDS cluster, this is available outside of your VPC on the open internet.

That’s the main selling point to me, though the pooling of MySQL connections by the HTTPS proxy is nice too.

joemag|6 years ago

Unless I’m misunderstanding the question, many HTTP clients pool HTTP(S) connections. Most of them do that by default. So the connection establishment cost gets amortized over a large number of API calls.

unknown|6 years ago

[deleted]

19 comments