latch|11 years ago
FWIW, I've found that building a robust and deep "API Gateway" is the key to making SOA/Microservices work. Otherwise, you end up with duplication and latency.
Routing and authentication are obvious candidates. It's also a good place to track stats and tag each request with a unique ID so you can trace it as it flows through your services.
By "deep", I mean that it should be application-aware. Caching is a good example. For many applications, URL + query string results in too many permutations. If the cache is application-aware, it can often use more meaningful keys. Additionally, events in the underlying services can be used to purge the cache, which can result in a wicked cache hit ratio.
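To make that concrete, here's a rough sketch (all names and event shapes are invented for illustration): the cache keys on a meaningful identity like product:<id> instead of the raw URL, and purges entries when the owning service emits a change event.

```python
# Sketch of an application-aware gateway cache. Instead of keying on
# url + querystring, key on a meaningful identity, and purge entries when
# the owning service reports a change. All names are illustrative.

class GatewayCache:
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        self._store[key] = value

    def on_event(self, event):
        # e.g. {"type": "product.updated", "id": 42} published by the catalog
        if event["type"] == "product.updated":
            self._store.pop(f"product:{event['id']}", None)

cache = GatewayCache()
cache.put("product:42", {"id": 42, "name": "Blue Widget", "price": 1999})

# A hit, regardless of which URL/query-string variant asked for product 42:
assert cache.get("product:42") is not None

# The catalog service announces a change; the gateway purges just that entry:
cache.on_event({"type": "product.updated", "id": 42})
assert cache.get("product:42") is None
```

Because purges are event-driven rather than TTL-driven, entries can live as long as the underlying data is actually unchanged, which is where the high hit ratio comes from.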
A more complex example has to do with duplication. Say you're building an ecomm platform. You have a service for search, one for recommendations, and one for the main catalog. They all need to return the same representation of a "product". Do you duplicate the logic? Do you tie them together and pay the latency and reliability price? No. Have them all just return IDs, and let the API Gateway hydrate the actual results. It's a form of API-aware server-side include, and it works well.
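The hydration step might look roughly like this, with stub functions standing in for the real service calls (everything here is invented for illustration):

```python
# Illustrative sketch: search/recommendations return only IDs; the gateway
# hydrates them from the catalog in one batch, so only the catalog owns the
# "product" representation. Stub functions stand in for real HTTP calls.

def search_service(query):
    return [3, 1, 7]  # IDs only, in relevance order

def catalog_service(ids):
    products = {1: "Anvil", 3: "Rocket skates", 7: "Bird seed"}
    return {i: {"id": i, "name": products[i]} for i in ids}

def gateway_search(query):
    ids = search_service(query)
    by_id = catalog_service(ids)    # one batch lookup against the owner
    return [by_id[i] for i in ids]  # preserve search's ordering

results = gateway_search("acme")
assert [p["id"] for p in results] == [3, 1, 7]
```

Search and recommendations never learn what a "product" looks like, so the representation can evolve in one place.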
The article seems to describe a system which has funnel-shaped dataflow (narrow inlet, spread to services), which is also a side-effect of the 'API Gateway' approach you mention.
tekacs|11 years ago
Three useful definitions[1] for microservices are:
1) collaborating
2) independently deployable (Martin Fowler [2])
3) globally-aware
Having an API Gateway makes them follow none of these meaningfully, as:
1) the services are collaborated _upon_ rather than doing so with each other (see Clump[3], not itself a bad idea)
2) updating a microservice will often require a redeploy of the API gateway, and for everyone to update their connections to it
3) any functionality you put in the gateway becomes encapsulated in a way tailored to application-global needs, which is pretty inimical to modular components, so you'd better not talk through the gateway except where you absolutely must!
Having orchestration/aggregation services as buro9 mentions in a sibling post is one good solution, satisfying all of those three points much more cleanly.
The danger arises when you start to think of your gateway(s) as layers in front of your system rather than just more microservices in it.
[1]: (context) https://vimeo.com/118895501#t=48s
[2]: http://martinfowler.com/articles/microservices.html
[3]: http://getclump.io/
buro9|11 years ago
I consider services to fall into two groups:
1. Core/Technical services that are the master record for data and are responsible for data integrity, notifying others of changes, audit, etc. Ultimately they store the data and the schemas of a piece of data, and handle caching.
2. Composition/Orchestration/Business services that sit in front of the core services and contain the business logic. When they operate on lists, they really operate on IDs, doing that ESI-style merge that composes their collection from multiple calls to the core services.
The more I work on services the more I find this separation and layering of services extremely useful. It helps massively simplify the services that interact with data stores, and allows for quick development of new composition services to handle new needs of the business.
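A tiny sketch of that layering (the records, names, and rules below are invented): the core service is a dumb master record, and the composition service applies business rules and builds the client-facing shape from IDs.

```python
# Sketch of the two-layer split: the core service owns storage and raw
# records; the composition service owns business rules (filtering,
# presentation) and merges core results by ID. All data is illustrative.

CORE_DB = {
    1: {"id": 1, "name": "Basic plan", "price_cents": 0,    "archived": False},
    2: {"id": 2, "name": "Pro plan",   "price_cents": 2900, "archived": False},
    3: {"id": 3, "name": "Old plan",   "price_cents": 900,  "archived": True},
}

def core_get_many(ids):
    # Core: master record only, no business logic.
    return [CORE_DB[i] for i in ids if i in CORE_DB]

def composition_list_plans(ids, include_archived=False):
    # Composition: business rules and the client-facing shape live here.
    records = core_get_many(ids)
    if not include_archived:
        records = [r for r in records if not r["archived"]]
    return [{"id": r["id"], "name": r["name"], "price": r["price_cents"] / 100}
            for r in records]

plans = composition_list_plans([1, 2, 3])
assert [p["id"] for p in plans] == [1, 2]
```

New business needs then mean a new composition function (or service) over the same core calls, not a change to the data layer.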
jasonwocky|11 years ago
> FWIW, I've found that building a robust and deep "API Gateway" is the key to making SOA/Microservices work. Otherwise, you end up with duplication and latency.
The risk here is that the API gateway gets too big, and its gravity starts attracting more and more functionality, until you have a brand new monolithic application.
I like the gateway approach, but generally I want to have multiple gateways, each serving as a "Microservice Facade" for a particular application or small set of applications. New applications get new Facades.
There ends up being some duplication across the facades, but basically I think "so be it". As long as the core services are factored out and own their business logic, I don't mind if the consuming applications have some duplication.
AndrewHampton|11 years ago
So here's a question we've been talking about at my office. When developing a microservice on your development machine, do you need to run the whole stack or just the service you're working on?
For example, let's say I'm working on service A, which depends on services B and C. Do I need to run all 3 apps and their data stores locally?
Currently, we typically point A at the staging instances of B and C. However, we have some long-running jobs that A initiates on B, and B needs to post back to A when it's finished. This doesn't work when pointing at staging B.
rgarcia|11 years ago
One option is a framework[1] which tries to start dependent services locally. It lets us spin up a service + all dependent services locally with a single command.
tekacs|11 years ago
A stopgap solution is to use something like the Actor[2] model: schedule actors on an ActorSystem and clone the context/scheduler/system for each client ID. As long as your actors are fairly sane, this should be fairly lightweight. Then just shut down the actors for a given client (actorSystem.shutdown() under Akka), either after a time or by having a client send a Shutdown message (or both).
[1]: http://wym.io/ [2]: http://akka.io/
jasonwocky|11 years ago
Depends on the use case. For automated testing, if I'm developing service A, then locally I'd usually want to stub B & C with something like Mountebank (http://www.mbtest.org/) (though I've never used it to do a post-back as you describe... not sure if it supports that out of the box).
If I just wanted to poke around a running system to see how things interact manually, yeah I'd run everything locally. Probably using Docker images for each dependency service and Vagrant to manage the suite of those images, to preserve my sanity.
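For the stubbing approach, a Mountebank imposter for service B could be described roughly like this. The payload follows mountebank's documented stub format, but the port, path, and response body here are made up, so check the mb docs before relying on any of it:

```python
import json

# Sketch of stubbing a dependency with Mountebank: describe an "imposter"
# (a fake HTTP server) as JSON, then POST it to mb's admin API (port 2525
# by default). The endpoint and canned response below are illustrative.

def imposter_for_service_b(port=4545):
    return {
        "port": port,
        "protocol": "http",
        "stubs": [{
            "predicates": [{"equals": {"method": "GET", "path": "/jobs/123"}}],
            "responses": [{"is": {
                "statusCode": 200,
                "headers": {"Content-Type": "application/json"},
                "body": json.dumps({"id": 123, "status": "complete"}),
            }}],
        }],
    }

imposter = imposter_for_service_b()
assert imposter["protocol"] == "http"
assert imposter["stubs"][0]["responses"][0]["is"]["statusCode"] == 200

# To install it (with mb running):
#   curl -X POST http://localhost:2525/imposters -d '<the JSON above>'
```

Service A is then pointed at localhost:4545 instead of the real B, which sidesteps the shared-staging problem for request/response calls (though not for B's asynchronous post-back).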
lobster_johnson|11 years ago
Can't speak for the author, but I can tell you what we do in our company, which is also completely microservice-based.
Backstory: We used to have a helper tool that allowed a developer to run any app locally. It tried to set up the same stack as we were using in production: HAproxy, Nginx, Ruby, Node, PostgreSQL.
It was problematic, because people had machines that differed slightly: Different versions of OS X (or Linux), Homebrew (some used MacPorts), Ruby, Postgres, etc.
We could have spent a lot of time on a script that normalized everything and verified that the correct versions of everything were installed, but the problem with developing such a tool is that you won't know what's going to break until you hire a new developer who needs to bootstrap their box. Or until the next OS X release, or something like that.
Syncing everything with the production environment was also difficult. The way we configure our apps, a lot of the environment (list of database servers, Memcached instances, RabbitMQ hosts, logging etc.) is injected. So with this system we'd have to duplicate the injection: Once in Puppet (for production), a second time on the developer boxes.
So we decided pretty early on to migrate to Vagrant.
---
We now run the whole stack on a Linux VM using Vagrant, configured with the same Puppet configuration we use for our production servers.
The Vagrant box is configured from the exact same Puppet configuration that we use for our production and staging clusters. The Puppet config has a minimal set of variables/declarations that need to be tweaked per environment. From the point of view of the Puppet config, it's just another cluster.
We periodically produce a new Vagrant box with updates whenever there are new apps or new system services. Updating the box is a matter of booting a new clean box and packaging it; Puppet takes care of all the setup. We plan on automating the box builds at some point.
To make the workflow as painless as possible, we have an internal "all-round monkey wrench" tool for everything a developer needs to interact with both the VM and our clusters, such as fetching and installing a new box (we don't use Vagrant Cloud). One big benefit of using Vagrant is that this internal tool can treat it as just another cluster. The same commands we use to interact with prod/staging — to deploy a new app version, for example — are used to interact with the VM.
One notable configuration change we need for Vagrant is a special DNS server. Our little tool modifies the local machine (this is super easy on OS X) and tells it to use the VM's DNS server to resolve ".dev". The VM then runs dnsmasq, which resolves "*.dev" into its own IP. We also have an external .com domain that resolves to the internal IP, for things like Google's OAuth that requires a public endpoint. All the apps that run on the VM then respond to various hosts ending with .dev.
Another important configuration change is support for hot code reloading. This bit of magic has two parts:
- First, we use Vagrant shared folders to allow the developer to selectively "mount" a local application on the VM; when you deploy an app this way, instead of deploying from a Git repo, it simply uses the mounted folder, allowing you to run the app with your local code that you're editing.
- Secondly, when apps run on the VM, they have some extra code automatically injected by the deployment runtime that enables hot code reloading. For Node.js backends, we use a watch system that simply crashes the app on file changes; for the front end stuff, we simply swap out statically-built assets with dynamic endpoints for Browserify and SASS to build the assets every time the app asks for them (with incremental rebuilding, of course). For Ruby backends, we use Sinatra's reloader.
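The crash-on-change trick is simple enough to sketch (illustrative only; a real setup would use filesystem event watchers rather than polling, and the names here are invented):

```python
import os
import time

# Sketch of "crash the app on file changes": poll source-file mtimes and
# exit the process when anything changes, relying on the process supervisor
# to restart the app with the new code.

def snapshot(paths):
    """Record the current modification time of each watched file."""
    return {p: os.stat(p).st_mtime for p in paths}

def changed(before, paths):
    """True if any watched file's mtime differs from the snapshot."""
    return snapshot(paths) != before

def watch_and_crash(paths, poll_seconds=1.0):
    before = snapshot(paths)
    while not changed(before, paths):
        time.sleep(poll_seconds)
    raise SystemExit("source changed; exiting so the supervisor restarts us")
```

The supervisor does the actual "reload"; the watcher's only job is to die promptly.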
---
Overall, we are very happy with the Vagrant solution. The only major pain point we have faced is not really technical: it's been hard for developers to understand exactly how the box works. Every aspect of the stack needs to be documented so that developers know where to look and what levers to pull when an app won't deploy properly or a queue isn't being processed correctly. Without this information, the box seems like black magic to some developers, especially those with limited experience administrating Linux.
We also sometimes struggle with bugs in Vagrant or VirtualBox. For example, sometimes networking stops working and DNS lookups fail. Or the VM dies when your machine resumes from sleep [1]. Or the VirtualBox file system corrupts files that it reads [2]. Or VirtualBox suddenly sits there consuming 100% CPU for no particular reason. It happens about once a week, so we're considering migrating to VMware.
Another possibility is to give people the option of running their VM in the cloud, such as on DigitalOcean. I haven't investigated how much work this would be. The downside would obviously be that it requires an Internet connection. The benefit would be that you could run much larger, faster VMs, and since they'd have public IPs you could easily share your VM and your current work with other people. Another benefit: they could automatically update from Puppet. The boxes we build today are configured once from Puppet, and then Puppet is excised entirely from the VM. Migrating to a new box version can be a little painful since you lose any test data you had in your old box.
As for your question about which services to run: it's a good question. Right now we only build a single box running everything, even though we have a few different front-end apps that people work on that all use the same stack. We'll probably split this into multiple boxes at some point, as memory usage is starting to get quite heavy. But since all the apps share 90% of the same backend microservices, the differences between the boxes are mostly going to be which front-end apps they run.
[1] https://www.virtualbox.org/ticket/13874
[2] https://www.virtualbox.org/ticket/819
chrisvxd|11 years ago
It really depends on the size of your microservices. If the services are small, then running each one (via some master shell script) shouldn't be an issue. If the services are large, then something like Vagrant would enable you to get a near-live environment fired up quickly and easily.
DanielBMarkham|11 years ago
Wouldn't blue-green deployment and immutable servers make this a non-issue? Or to put it differently, you probably shouldn't have a staging environment.
akbar501|11 years ago
1.) How are you handling auth? Are you using a home grown solution or using OpenID Connect + OAuth 2.0?
2.) Is the JWT behind the firewall using a pre-shared key?
3.) What does the public token look like, and how does the API Gateway perform auth? Does the token passed into the API Gateway contain only a user id? And does the API Gateway have to perform a database query to populate the full user object?
side note: Thanks for writing the article.
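For what it's worth, the kind of scheme question 2 describes (a compact internal token carrying just the user id, verified with a pre-shared key) can be sketched like this. This is a hand-rolled HMAC stand-in, not a real JWT implementation, and every name and key below is invented:

```python
import base64
import hashlib
import hmac
import json

# Illustrative: a signed token that carries only the user id, verified at
# the gateway with a pre-shared key. The gateway then does one user lookup
# and forwards the id internally (e.g. in an X-USER header).

SECRET = b"pre-shared-key"  # hypothetical; would come from config

def issue_token(user_id):
    payload = base64.urlsafe_b64encode(json.dumps({"uid": user_id}).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def verify_token(token):
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(payload))["uid"]

token = issue_token(42)
assert verify_token(token) == 42
```

Whether the gateway then hits a database to hydrate the full user object, or caches it, is exactly the trade-off question 3 is asking about.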
codewithcheese|11 years ago
Hi Tom, I too have a Django monolith, but I hesitate to go down the microservices route, since I reuse a lot of classes in what would become different services. Can you comment on how your class structure has changed, and how you have maximized (or not) code reuse?
tmwatson100|11 years ago
What sort of classes do you mean? Views? Models? Other? For us it hasn't changed much. Our apps were pretty self-contained, so splitting them into separate services isn't very arduous.
Stuff that is shared between apps is often related to 3rd-party integrations, which could be moved into a separate (often asynchronous) worker/service. In reality most of these design choices are made on a case-by-case basis, based on time/cost/maintenance.
gabrtv|11 years ago
> The services are considered to be in a trusted network and are accessed by a private token passed in the 'Authorization' header plus the user id of the requester in an 'X-USER' header.
This reads like the user ID is exposed in a header without any sort of encryption.
adaml_623|11 years ago
A lean, quick service is not going to want to wait on encryption handshaking.
zerr|11 years ago
Even for the recently posted free 3 chapters from the NGINX microservices book - while skimming, I had this constant feeling of "show me the code".
hannes2000|11 years ago
Judging from your team size (3 engineers on the team page), this is probably still a very normal-sized microservice :)