> There is some truth to it. This, however, isn’t due (too much) by how you write the code from a syntax perspective: while coding my Python program I noticed that when the function is run in the context of Lambda, the platform expects to pass a couple of parameters to the function (“event” and “context”). I had to tweak my original code to include those two inputs (even though I make no use of them in my program).
This is just a product of bad design
A better approach would be to decouple all your business logic and provide an interface to it.
That way if you need to move away from AWS Lambda, you can simply remove MyLambdaRequestHandler and any associated unit tests and your application code is unaffected.
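A minimal sketch of that decoupling, in Python since that is what the article's program used. The function names and the `celsius` event key are hypothetical. `lambda_handler` plays the role of MyLambdaRequestHandler: it is the only piece that knows about Lambda's `event` and `context` arguments.

```python
# Business logic: knows nothing about AWS Lambda (hypothetical example function).
def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9.0 / 5.0 + 32.0


# Thin adapter: the only code aware of Lambda's (event, context) signature.
# The "celsius" key is an assumed event shape, not something Lambda defines.
def lambda_handler(event, context):
    celsius = float(event["celsius"])
    return {"fahrenheit": celsius_to_fahrenheit(celsius)}


if __name__ == "__main__":
    # Local run without Lambda: call the business logic directly.
    print(celsius_to_fahrenheit(100.0))
```

Moving to another platform would then mean writing a new adapter of roughly this size and leaving `celsius_to_fahrenheit` and its tests alone.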
Yep, exactly. We're currently running the exact same code on both AWS Lambda and IronWorker, all that differs is a simple handler file.
We're looking to add support for Microsoft's new Azure Functions and Google Cloud Functions, and this will be a matter of creating a single file for each to handle the input.
You should always abstract your dependencies, especially if it's a critical part of your infrastructure.
I personally hate stored procedures because they invite bad design. The code for them rarely makes it into source control, and even when it does there are often contextual parts of the schema that aren't included.
If you can do them properly (i.e. source control everything, proper schema update scripts, etc.) they can work OK. Just hope your marketing guys don't want to do any A/B testing of various algorithms.
Stored procedures, IMO, are a case where DBAs should be pushing their pain down to the software engineering teams. I think a lot of this delineation will disappear as cross-functional product teams become more integrated across the industry, however.
Usually Massimo's stuff is quite well thought out and practical when it comes to adopting new models in the Enterprise, but this article has a "I'm annoyed by this platform because it doesn't work how I want it to" sentiment throughout. He seems to miss a lot of the potential value that the serverless architecture model brings to the table.
It's pretty well established that Amazon's usual approach is to provide a "toolbox" of services that can be used to build any number of app architecture permutations, without all the typical fluff and polish expected by large business customers. I for one, as an engineer, appreciate this model since it's much more accessible and lightweight.
> He seems to miss a lot of the potential value that the serverless architecture model brings to the table.
It's a code evaluation platform, running containerized under the hood, and with a large markup by Amazon. Companies have offered this for years before AWS, but AWS is tooting the horn louder than people have before.
Is it useful? Yeah, sure. Is it revolutionary? Oh come on now.
Just like "the cloud" is just timeshare on someone else's computer, this is simply code management and execution abstracted a few more layers up.
@snockerton thanks for the kind comments. No, that was NOT (by any means) how I approached the post. Arguably the attempt at humor in the article and the choice of a misleading title may have led readers to think I was dismissive of the technology. But that wasn't the intent. I do find Lambda (and Serverless in general) very intriguing and I agree on the potential.
Sorry, yes it has been down for a while. My basic hosting isn't sized to handle north of 8000 hits in a couple of hours :) (I did not expect this much, as on average my site has a few hundred hits per day).
I don't understand why Amazon doesn't launch a clone of App Engine. This is the holy grail in NoOps. AWS Lambda + API Gateway is a bureaucratic exercise in pain, true to AWS form, while App Engine is a pleasure to use. Anyone who has used both services will understand me.
It seems to me (from reading HN) like most people are avoiding using App Engine because Google tortured and killed their pet, err, I mean cancelled Google Reader three years ago.
(And I guess the inference is that because Google killed a tiny unprofitable service once in the past, you cannot realistically depend on them to continue to provide services they are actually putting real money into because they are of strategic interest. Yeah, that makes total brogrammer sense, let's go with that line of thinking.)
I've been using App Engine for many projects since 2008. Like any tool, it's not the right tool for every job, but for its intended use cases, it's great. Being able to click Deploy, and not have to deal with sys admin is really convenient.
The one area where App Engine falls a bit short is lack of support for some widely used libraries like Numpy. It would be nice if The Google would add support for those (support for some transcoding libraries would be nice as well). Even better would be an interface to TensorFlow.
All these years of progress and a site can't be up when it's on HN?
On that tangent: I've always personally wondered why social bookmarking sites (Reddit, HN, etc.) don't cache/mirror/archive the sites people link to through them. The original content site's ads and analytics and such are usually client-side JS anyway, so they'd still get dynamically served even on the mirrored page. The original author would still make money from the traffic; they'd just avoid having to pay for the concomitant DDoS.
I didn't find much in the author's post that I agreed with. In particular, a few of his points stuck out to me:
> In traditional PaaS world the code is the indisputable protagonist (oh, damn, and you also happen to need a persistent data service to store those transactions BTW).
> With Lambda the data is the indisputable protagonist (oh and you also happen to attach code to it to build some logic around data).
> A few years of advancement and we are back to stored procedures.
I don't agree at all. Lambda is code. It's code that is invoked upon receiving certain kinds of data, but in no way is the Lambda code subservient to the data, or (unlike stored procedures) even colocated with it. Lambda still needs persistent data services to play with persistent data -- there's nothing magic about it.
> There also have been a lot of discussions as of late re the risk of being locked-in by abusing Serverless architectures (like Lambda).
> There is some truth to it. This, however, isn’t due (too much) by how you write the code from a syntax perspective: while coding my Python program I noticed that when the function is run in the context of Lambda, the platform expects to pass a couple of parameters to the function (“event” and “context”). I had to tweak my original code to include those two inputs (even though I make no use of them in my program).
The author describes having to write a controller method. This is not surprising, considering he was trying to make a web service.
> So, IMO, the lock-in will not be a function of how different the syntax in your code will be Vs. running it on a platform you control (probably minimally different) but rather in how scattered and interleaved with other services your code will be (at scale).
This is the most cogent point in the article. API Gateway + AWS Lambda can be used to create micro-microservices. Serverless Framework tries to wrangle this potential complexity by allowing users to group related lambdas/endpoints as a whole, but there is still the opportunity to create a real rat's nest of logic if we're not careful.
> P.S. Yes, I know that it’s called “Serverless” but it doesn’t mean “there are no servers involved”. Are we really discussing this?
Yeah, obviously servers are involved. But the fact that I don't have to care about those servers nearly as much (in terms of maintenance or in terms of up-front cost -- both are big wins for most customers) is worth discussing.
There are different approaches that make sense for different use cases. Is your use case a big database that you just run some commands on top of? Lambda seems great in that case! Just build a front-end client, have it interface directly with Lambda, and you've got a pretty quickly developed app.
Is it the right architecture for everything? Hell no. System architects are in higher demand than ever because there are so many freaking ways to build technology products these days, and it's their job to figure out what tech is right for the job.
Not there yet. Stored procedures are seamlessly integrated with underlying database. There are some hurdles with that for Lambda.
But the critical point is that Lambda is distributed and massively scalable, and stored procedures weren't.
Remember that company, Sun? The one that invented Java? When it was still alive, it had an unofficial motto, "the net is the computer". Nobody understood it then; now we know.
All computing science achievements will now be reproduced in a distributed environment. OS? Check (AWS/DCOS/Kubernetes). Filesystems? Check (IPFS). IPC? Check (REST/Websocket). Perhaps even "drivers" will be a new thing (for IoT devices).
Lambda for node devs at the moment is terrible: the version of node is now 13 months old, and with the pace of node development it sucks that AWS can't keep up even with major versions :( Right now I can't use Lambda.
Honestly I think a lot of the frustration is because Lambda is not a fully developed platform yet. I decided to use it for a new product we're building at Cronitor and it's not a travesty but I wouldn't have used Lambda if I had it to do over again.
They give you decent primitives: immutable versions, aliases, easy logging. But everything else you have to build yourself: You have to figure out a development loop, deploys, configuration management. There is nothing built-in to help coordinate lambda deploys across regions.
I expect that you'll see this built out more this year.
> You have to figure out a development loop, deploys, configuration management. There is nothing built-in to help coordinate lambda deploys across regions.
Lambda is very limited due to the very short run time (5 min CPU time) and the limited disk space (500 MB in /tmp). It's always been my thought that the reason for this is that Amazon is effectively running Lambda functions on unused (but possibly reserved) hosts, so that they can easily be shut down and don't consume a lot of disk space, and no one will notice (or even be able to tell) the slight performance hiccup.
Lambda has a number of use cases, but if you need something to run for more than 5 minutes with lots of disk space, you're probably better off running on EC2 (or Docker containers via ECS); Lambda is designed for short-lived, stateless computations.
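One way to live with the time limit is to check the remaining budget between units of work and hand back whatever is left. A rough sketch follows: `get_remaining_time_in_millis()` is the method the Lambda Python context object exposes, while `process_batch`, the 10-second margin, the doubling "work" and `FakeContext` are illustrative assumptions.

```python
def process_batch(items, context, safety_margin_ms=10_000):
    """Process items until the remaining time drops below the margin.

    Returns (results, leftover) so the caller can re-queue the leftover items.
    """
    results = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return results, items[index:]
        results.append(item * 2)  # placeholder for the real per-item work
    return results, []


class FakeContext:
    """Local stand-in for the Lambda context; reports a scripted remaining time."""

    def __init__(self, remaining_ms_sequence):
        self._remaining = iter(remaining_ms_sequence)

    def get_remaining_time_in_millis(self):
        return next(self._remaining)
```

The leftover items still have to go somewhere (a queue, another invocation), which is where the "stateless, short-lived" constraint starts to shape the design.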
Not sure about the other versions, but working with Java, Lambda was quite a terrible experience (at least if you needed to include a bunch of jars for your work).
[+] [-] djhworld|10 years ago|reply
This is just a product of bad design
A better approach would be to decouple all your business logic and provide an interface to it.
e.g.
That way if you need to move away from AWS Lambda, you can simply remove MyLambdaRequestHandler and any associated unit tests and your application code is unaffected.
[+] [-] dchesterton|10 years ago|reply
We're looking to add support for Microsoft's new Azure Functions and Google Cloud Functions, and this will be a matter of creating a single file for each to handle the input.
You should always abstract your dependencies, especially if it's a critical part of your infrastructure.
[+] [-] yeukhon|10 years ago|reply
[+] [-] exelius|10 years ago|reply
If you can do them properly (i.e. source control everything, proper schema update scripts, etc.) they can work OK. Just hope your marketing guys don't want to do any A/B testing of various algorithms.
Stored procedures, IMO, are a case where DBAs should be pushing their pain down to the software engineering teams. I think a lot of this delineation will disappear as cross-functional product teams become more integrated across the industry, however.
[+] [-] jessaustin|10 years ago|reply
Perhaps, rather, a product of an infelicitous programming language? You don't have to worry about extra function arguments in javascript.
It seems common in C, python, and similarly arity-concerned languages to just accept a pointer to a struct (dict, etc.) when coding a callback for an API like this. That way the callback can examine as much or as little of the struct as it wants, and much less needs to change in future when the callback needs more.
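A small sketch of that single-struct convention in Python. `dispatch`, `greet` and the event keys are hypothetical names for illustration, not part of any real API.

```python
# Hypothetical dispatcher: passes every callback one dict instead of positional arguments.
def dispatch(callback, event):
    return callback(event)


# This callback ignores everything except the one field it cares about.
def greet(event):
    return "hello " + event.get("name", "world")


# A later callback can read more fields without changing the dispatcher's signature.
def greet_with_source(event):
    return greet(event) + " from " + event.get("source", "unknown")
```

Extra keys in the dict are simply ignored by `greet`, so the caller can grow the payload without breaking existing callbacks.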
[+] [-] snockerton|10 years ago|reply
It's pretty well established that Amazon's usual approach is to provide a "toolbox" of services that can be used to build any number of app architecture permutations, without all the typical fluff and polish expected by large business customers. I for one, as an engineer, appreciate this model since it's much more accessible and lightweight.
[+] [-] toomuchtodo|10 years ago|reply
It's a code evaluation platform, running containerized under the hood, and with a large markup by Amazon. Companies have offered this for years before AWS, but AWS is tooting the horn louder than people have before.
Is it useful? Yeah, sure. Is it revolutionary? Oh come on now.
Just like "the cloud" is just timeshare on someone else's computer, this is simply code management and execution abstracted a few more layers up.
EDIT: Sorry I'm not on the hype train folks.
[+] [-] mreferre|10 years ago|reply
[+] [-] Mizza|10 years ago|reply
One way to avoid that is to use web frameworks based on AWS Lambda, such as Zappa - https://github.com/Miserlou/Zappa
[+] [-] pweissbrod|10 years ago|reply
[+] [-] mreferre|10 years ago|reply
[+] [-] johansch|10 years ago|reply
It seems to me (from reading HN) like most people are avoiding using App Engine because Google tortured and killed their pet, err, I mean cancelled Google Reader three years ago.
(And I guess the inference is that because Google killed a tiny unprofitable service once in the past, you cannot realistically depend on them to continue to provide services they are actually putting real money into because they are of strategic interest. Yeah, that makes total brogrammer sense, let's go with that line of thinking.)
[+] [-] brianmcconnell|10 years ago|reply
The one area where App Engine falls a bit short is lack of support for some widely used libraries like Numpy. It would be nice if The Google would add support for those (support for some transcoding libraries would be nice as well). Even better would be an interface to TensorFlow.
[+] [-] jessaustin|10 years ago|reply
[+] [-] yeukhon|10 years ago|reply
[+] [-] avitzurel|10 years ago|reply
I worked with Lambda a lot, piping the JSON input into Go programs, and I could not be happier with it.
I work with Go just like I am used to, testing, compiling, CI and everything and then, I have a shell script that deploys it to Lambda (uploads the zip).
For what I need, lambda is absolutely great!
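The comment doesn't say which runtime hosted the shim, so the following is only a guess at the shape of it, written in Python: the handler serializes the event to JSON, feeds it to a child process on stdin, and parses JSON from its stdout. `run_child` and `CHILD_COMMAND` are hypothetical names, and a tiny Python child stands in for the compiled Go binary so the sketch runs anywhere.

```python
import json
import subprocess
import sys


def run_child(command, event):
    """Send the event as JSON on the child's stdin and parse JSON from its stdout."""
    result = subprocess.run(
        command,
        input=json.dumps(event).encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise if the child exits with a non-zero status
    )
    return json.loads(result.stdout.decode("utf-8"))


# Stand-in for the Go binary: a child process that echoes the event back.
# In a real deployment this would be the path to the binary shipped in the zip (assumption).
CHILD_COMMAND = [
    sys.executable,
    "-c",
    "import json, sys; json.dump({'received': json.load(sys.stdin)}, sys.stdout)",
]


def lambda_handler(event, context):
    return run_child(CHILD_COMMAND, event)
```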
[+] [-] derefr|10 years ago|reply
[+] [-] Pyxl101|10 years ago|reply
[+] [-] gamache|10 years ago|reply
> In traditional PaaS world the code is the indisputable protagonist (oh, damn, and you also happen to need a persistent data service to store those transactions BTW).
> With Lambda the data is the indisputable protagonist (oh and you also happen to attach code to it to build some logic around data).
> A few years of advancement and we are back to stored procedures.
I don't agree at all. Lambda is code. It's code that is invoked upon receiving certain kinds of data, but in no way is the Lambda code subservient to the data, or (unlike stored procedures) even colocated with it. Lambda still needs persistent data services to play with persistent data -- there's nothing magic about it.
> There also have been a lot of discussions as of late re the risk of being locked-in by abusing Serverless architectures (like Lambda).
> There is some truth to it. This, however, isn’t due (too much) by how you write the code from a syntax perspective: while coding my Python program I noticed that when the function is run in the context of Lambda, the platform expects to pass a couple of parameters to the function (“event” and “context”). I had to tweak my original code to include those two inputs (even though I make no use of them in my program).
The author describes having to write a controller method. This is not surprising, considering he was trying to make a web service.
> So, IMO, the lock-in will not be a function of how different the syntax in your code will be Vs. running it on a platform you control (probably minimally different) but rather in how scattered and interleaved with other services your code will be (at scale).
This is the most cogent point in the article. API Gateway + AWS Lambda can be used to create micro-microservices. Serverless Framework tries to wrangle this potential complexity by allowing users to group related lambdas/endpoints as a whole, but there is still the opportunity to create a real rat's nest of logic if we're not careful.
> P.S. Yes, I know that it’s called “Serverless” but it doesn’t mean “there are no servers involved”. Are we really discussing this?
Yeah, obviously servers are involved. But the fact that I don't have to care about those servers nearly as much (in terms of maintenance or in terms of up-front cost -- both are big wins for most customers) is worth discussing.
[+] [-] exelius|10 years ago|reply
Is it the right architecture for everything? Hell no. System architects are in higher demand than ever because there are so many freaking ways to build technology products these days, and it's their job to figure out what tech is right for the job.
[+] [-] gamache|10 years ago|reply
[+] [-] atemerev|10 years ago|reply
But the critical point is that Lambda is distributed and massively scalable, and stored procedures weren't.
Remember that company, Sun? The one that invented Java? When it was still alive, it had an unofficial motto, "the net is the computer". Nobody understood it then; now we know.
All computing science achievements will now be reproduced in a distributed environment. OS? Check (AWS/DCOS/Kubernetes). Filesystems? Check (IPFS). IPC? Check (REST/Websocket). Perhaps even "drivers" will be a new thing (for IoT devices).
[+] [-] philliphaydon|10 years ago|reply
[+] [-] singlow|10 years ago|reply
[+] [-] encoderer|10 years ago|reply
They give you decent primitives: immutable versions, aliases, easy logging. But everything else you have to build yourself: You have to figure out a development loop, deploys, configuration management. There is nothing built-in to help coordinate lambda deploys across regions.
I expect that you'll see this built out more this year.
[+] [-] spdustin|10 years ago|reply
https://github.com/serverless/serverless solves (upon my initial read-through) every one of your requirements, including multi-region deploys.
[+] [-] unknown|10 years ago|reply
[deleted]
[+] [-] mangeletti|10 years ago|reply
Was this sentence written in doublespeak, or did I just have a stroke?
[+] [-] mreferre|10 years ago|reply
Corrected. Thanks.
[+] [-] iamleppert|10 years ago|reply
[+] [-] djhworld|10 years ago|reply
[+] [-] derefr|10 years ago|reply
There's nothing fundamentally wrong with stored procedures, per se. What was wrong with SQL RDBMS stored procedures was that:
1. each DBMS had its own stored-procedure programming language—and so application frameworks that wanted to provide compatibility with the generic idea of a "relational database" couldn't really use them unless they had devs on their staff familiar with each-and-every DB†;
2. there was—and basically still is—no concept of an RDBMS stored-procedure "view" or "schema"—i.e., API versioning for stored procedures, where a client can request to communicate with the set of stored procedures it was compiled to support, rather than the single version the database is holding onto today;
3. one major RDBMS (MySQL) never supported stored procedures at all, so many devs learned an ossified set of "web development best-practices" without ever being exposed to the idea of stored procedures as an option.
All of these issues are fixable. #3 is just a historical artifact of MySQL's laziness; #1 is likewise an artifact of the proprietary, "enterprise lock-in" nature of the first instances of stored procedures (Microsoft's and Oracle's), evenly fracturing the ecosystem away from adopting either. Neither is likely to repeat.
#2 is more pernicious, and to this day seems ill-addressed.
One place I've seen at least an attempt to resolve it is in the design of Redis's Lua queries, where the "solution" is to refer to the stored procedures by content-hash, with the database having an always-possible error case that requires clients to be able to fall back to inserting the stored procedure again (thus necessitating that all clients track their own copies of any stored procedures they want to call.)
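The call-by-hash-with-fallback pattern can be sketched without a real database. In Redis the relevant commands are EVALSHA and SCRIPT LOAD, with a NOSCRIPT error on a miss; everything below (`ToyScriptServer`, `call_procedure`, using a Python expression as the "procedure") is a toy stand-in assumed for illustration, not Redis's API.

```python
import hashlib


class NoScriptError(Exception):
    """Raised by the toy server when it does not hold the requested script."""


class ToyScriptServer:
    """In-memory stand-in for a database that stores procedures by content hash."""

    def __init__(self):
        self._scripts = {}

    def load(self, source):
        digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
        self._scripts[digest] = source
        return digest

    def call_by_hash(self, digest, arg):
        if digest not in self._scripts:
            raise NoScriptError(digest)
        # Toy only: the "procedure" is a Python expression over `arg`.
        return eval(self._scripts[digest], {"__builtins__": {}}, {"arg": arg})

    def flush(self):
        # Simulates the server losing its script cache.
        self._scripts.clear()


def call_procedure(server, source, arg):
    """Call by hash; on a miss, re-insert the client's own copy and retry once."""
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
    try:
        return server.call_by_hash(digest, arg)
    except NoScriptError:
        server.load(source)
        return server.call_by_hash(digest, arg)
```

Because the hash is derived from the source, two clients built against different versions of a procedure never collide: each calls, and if necessary re-inserts, exactly the version it carries.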
Such a solution could be ported to other RDBMSes; I could imagine Postgres, for example, having a "database view" concept††, where real databases only contain raw tables and indexes, and all the view definitions, triggers, stored procedures, constraints, and even typedefs are held in some record/spec/document that can be both manipulated as data, and connected to as a database. This is sort of equivalent to the CouchDB 'design document' concept.
---
† Sadly, it's really just a syntax problem. If you had several DBMSes that all had the same syntax but different extensional semantics, it'd be very easy to write a single code-generator into your application framework that would spit out appropriate code to take advantage of the extensions available. That's how regular ORM SQL-generation works, after all. But when you have disparate syntaxes, suddenly you need disparate code-generators, which get out of sync and lose features (or an LLVM-like intermediate-representation that you can do the semantic-optimization steps to before finally doing the codegen step, but I can't imagine that'd be cheap enough to slot into a webapp's hot loop.)
†† To go all the way with such a concept, the real 'data' of the database—the tables and the indexes—can become floating objects, not contained "in" anything or defined anywhere, merely existing because of a ref-count from various vDB schemas. You wouldn't explicitly define tables; instead, you'd define your views (relational projections) and then assert identity relationships between some of the columns of those projections, causing one "table" to exist holding the underlying data for both views. This is, AFAIK, what https://en.wikipedia.org/wiki/Dabble_DB was working toward.
[+] [-] testrun|10 years ago|reply
- Depending on database, the dev tool environment can be extremely limited
- Depending on database, debugging can be a nightmare
- Difficult, if not impossible, to scale over more than one node
- Another version dependency problem added
- If you want to sell your software, or services depending on an application that uses stored procedures, you need to be very careful how you manage licenses.
[+] [-] jmlucjav|10 years ago|reply
[+] [-] djhworld|10 years ago|reply
If you need to include a bunch of jars, using Maven + the maven shade plugin (or assembly plugin) to generate a 'fat jar' is very simple.
In fact, their official documentation states exactly this http://docs.aws.amazon.com/lambda/latest/dg/java-create-jar-...