This is just some MVP java app with an nginx proxy, elastic-search, and mysql? Why is this running in k8s on aws via Kops in conjunction with a cloudformation template for aurora inside of a VPC? With containers separately built via packer + ansible kicked off via jenkins after each pull request merge? The CI pipeline also kicks off a canary deployment in our UAT environment with a prometheus exporter for monitoring on our grafana dashboard, and if you want to see logs you have to look at our ELK stack for that, and if you want to do any debugging we have Jaeger for tracing across our service mesh, which is based on Istio using Calico as a network overlay. All that for an app that's some crappy knock-off of trello.
newsbinator|7 years ago
autotune|7 years ago
hardwaresofton|7 years ago
I do lament that tools have become so complicated, but just about every piece mentioned (except for maybe kubernetes?) actually does a thing that is likely useful to you in production. A rundown:
Java - yikes but OK, the JVM is an excellent piece of software, of course you need the actual app you want to run
NGINX - TLS termination, compression, timeouts, rate limiting -- don't have to put this in your app and deal with SpringWhatchaMaCallits if you just do it @ nginx. complexity here means less @ the app level
ElasticSearch - I'm not sure ElasticSearch is the right tool but usually for most apps you'll want search, and you'll want search that is good enough (aka not slow/doesn't suck)
MySQL - A database of some sort is necessary, and most embedded/single file databases aren't the right fit for a multi-user frequent-concurrent access model, though I do love me some SQLite so I might fight myself on that point.
containers - containers are sandboxed, resource-constrained processes. I believe that it's better to run containerized processes rather than regular processes (i.e. just running your app) because of this isolation. Can't have one app clobber settings/whatever for another if they're properly isolated.
k8s (kubernetes) - a container orchestration tool -- if you're going to run containers on more than one machine then maybe you want to be able to treat all those machines more simply without manually managing all of them. This is obviously not necessary. If you only have to manage 2 machines, maybe just set up systemd properly and make sure two processes are always running and the internet can get to them, whatever.
kops - A tool made for easily provisioning machines on AWS into kubernetes clusters -- only necessary if you've bought in to containers and kubernetes.
cloudformation - AWS's tool for automating infrastructure creation and management -- while cloudformation might be hard to use, time spent automating pays time dividends that approach infinity (don't quote me on this), as long as you actually finish building the automation.
aurora - AWS's offering of an RDBMS, with some scaling advantages and other amazon-secret-sauce stuff. In the example this is deployed with cloudformation so someone on your team doesn't have to go into the actual amazon dashboard and click around for 5 minutes.
VPC - AWS's rendition of a private network for you to use their services on... you probably don't want to not run your stuff in a VPC (I don't even know if that's possible anymore) -- hard to know who else is running in your cloud.
packer - Packer is less important for containers but more for VMs (but I think you can use it for containers too?) -- basically if you have your source code in a folder, it would be nice if you could build a container for it automatically. Packer was used more predominantly for building VM images (AMIs for AWS) IIRC -- this helps to automate the deployment process, you can ensure every machine you start in the cloud or at home has a base level of configuration/software installed. You could even make your deploy artifact (the one thing you need to deploy your app) a VM image or an AMI and all you have to do is spin up a machine and your app is running.
ansible - fantastic automation tool for when you have to perform a task across one or more machines, for example, pushing a container image to a certain machine, or adding a user, or installing docker, or whatever.
jenkins - general task runner and automation helper differing from ansible in that it's used more through its web interface, and is always-on, so it can do things like running tests whenever code is pushed to a given repository. This speeds up your team by letting them know when something's broken faster. Also, it can do things like deploys, post-deploy smoke tests, etc -- the more things that some random software does, the less I have to worry about jeff/gina fucking up a deploy|test|whatever.
CI - continuous integration is nice -- run your tests automatically so devs don't have to worry about it, make the results visible so no one merges bad code. Even better, make it impossible to merge code that doesn't pass the tests, or ensure that a certain amount of coverage is achieved (though code coverage can be a bad metric)
canary deployments in UAT - This is actually something mostly advanced engineering orgs do. User Acceptance Testing is arguably the only testing that matters, because if your user can't actually do what your app is supposed to do, your app may as well not exist, no matter how well it is built and unit/integration tested. UAT is when you get a person to sit and actually use the app and do what was supposed to be possible. "Canary deployment" is not really the right term here but in context I think wetpaste is referring to spinning up a "fake" version of your app, so that testers can touch something close to the actual thing.
canary deployments - Canary deployments are more traditionally doing a small deployment of a new service (let's say, serve 10% of your users the new version of an app), and observe/watch for errors before letting a deployment go wild. Again, mostly this is only done at really advanced engineering orgs.
Prometheus - Relatively simple and robust monitoring tool for dealing with time series data
Grafana - dashboarding tool that takes input from a few places (prometheus, RDBMS, etc) and charts data so you can easily see how your stuff is doing -- RED (Rate Errors Duration) is a pretty decent monitoring methodology for web services
ELK (ElasticSearch, Logstash, Kibana) - This stack is a little less necessary IMO but logstash pushes your logs to elasticsearch, and kibana makes them visible from the web. Logging is important of course, but you could probably just SSH in and look at the logs or rotate and extract files (or use something like cloudwatch logs if you're on AWS) happily for years.
Jaeger - operation-level tracing so you can easily find out how long different things are taking on your web service -- yes the `/expensive-request` is taking 5000ms, but which parts of what's actually happening are causing it, in production? the DB request? JSON munging on the API side? some other thing?
Service mesh - Much like NGINX, this is a way to outsource things like monitoring code (which you'd have to integrate into every app you wanted monitoring) to the transport layer -- you talk to a proxy (whether per-process or per-underlying-machine), and the proxies ferry your messages to wherever they're supposed to go. They also support service discovery, so now your java app doesn't need to know exactly where its mysql instance is (which you'd normally feed in with ENV variables), it can just send stuff to mysql://my-mysql or whatever, and as long as the mesh is properly configured, it will go to the right place, and the mesh can do things like telling you how long every request took, or do circuit breaking, or retries, or whatever.
Istio - Service mesh that's built for kubernetes, it does the usual service mesh stuff plus some more, like super configurable routing (the kind that might enable canary deploys), automatic mutual TLS for traffic between services, and cluster-wide authN/authZ
Calico - Sort of assumes the buy in to containers & kubernetes -- if you're going to run multiple containers on multiple machines but don't want to know every machine's IP and every container's IP, you're going to need a network overlay that simplifies things. In addition to reachability, calico also enables kubernetes's NetworkPolicy controls, so you can restrict intra-cluster communications
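To make the canary deployments item above a bit more concrete, here's a minimal sketch of the "serve 10% of your users the new version" idea. This is not how Istio or any particular router does it -- `pick_version`, the bucket scheme, and the `"canary"`/`"stable"` labels are all made up for illustration. The only point is that the split should be deterministic per user, so the same person doesn't bounce between versions on every request.

```python
import hashlib


def pick_version(user_id: str, canary_percent: int = 10) -> str:
    """Hypothetical router: map a user to 'canary' or 'stable'.

    Hashing the user id gives a stable bucket in [0, 100), so a given
    user always lands on the same version for a given percentage.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    # Illustrative user ids only; real traffic would supply these.
    users = [f"user-{i}" for i in range(10_000)]
    share = sum(pick_version(u) == "canary" for u in users) / len(users)
    print(f"canary share: {share:.1%}")
```

You'd then watch error rates for the canary bucket and only raise `canary_percent` if nothing looks off.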
That said, I do abhor incidental complexity, and do like simple dependable tools -- however, most of these tools do actually serve a purpose and aren't just bloat. I think these days you can start off as simple as you like, and by all means fight complexity as you see it rear its ugly head, but at some point, you're going to want to know which requests are slow for your users. At that point, you need to make the choice between a service mesh and a standalone jaeger instance -- that is when it's useful that you know both things exist, so you don't pick the tool that is more complicated (the service mesh) when you don't have to.
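For a feel of what "which part of the request is slow" looks like before you adopt anything, here's a toy sketch of operation-level timing. It is not the Jaeger client API -- `Trace`, `span`, `slowest`, and `handle_expensive_request` are hypothetical names, and the `time.sleep` calls are stand-ins for a DB query and JSON munging.

```python
import time
from contextlib import contextmanager


class Trace:
    """Toy tracer: records (name, depth, seconds) for each finished span."""

    def __init__(self):
        self.spans = []
        self._depth = 0

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        depth = self._depth
        self._depth += 1
        try:
            yield
        finally:
            # Spans are appended when they finish, so children come first.
            self._depth -= 1
            self.spans.append((name, depth, time.perf_counter() - start))

    def slowest(self, depth=1):
        """Name of the slowest span at the given nesting depth, or None."""
        candidates = [s for s in self.spans if s[1] == depth]
        if not candidates:
            return None
        return max(candidates, key=lambda s: s[2])[0]


def handle_expensive_request(trace):
    # Stand-in handler; the sleeps simulate work of different cost.
    with trace.span("/expensive-request"):
        with trace.span("db_query"):
            time.sleep(0.05)
        with trace.span("json_munging"):
            time.sleep(0.01)


if __name__ == "__main__":
    trace = Trace()
    handle_expensive_request(trace)
    for name, depth, seconds in trace.spans:
        print(f"{'  ' * depth}{name}: {seconds * 1000:.1f} ms")
    print("slowest child:", trace.slowest())
```

A real tracer adds propagation across services and a UI on top, but the per-operation breakdown is the part that answers the question.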
exikyut|7 years ago
"Yeah, no. I have no plans to try and actually make anything with that tangle. Hah."
"Although... I do know what nginx, java, containers, k8s, VPCs, ansible, jenkins, and CI are... so... maybe it's really all the same at the end of the day, I'll see everything in this list similarly 5 years from now, and I should see if I can tolerate it?"
"Hm. There is the small fact that I don't really know what k8s, VPCs, ansible and jenkins actually do, let alone the entire first list, and of all of these I've probably only really used Java, and that only a few times. I think I've had a few minutes' look at GitLab's Grafana dashboard once or twice?"
"I'll bet installing all of these, including all dependencies, would take probably multiple tens of gigabytes of diskspace, and probably consume more RAM than I have installed." (I'm running short on both, I have a few hundred MB of diskspace free right now, and never any free RAM :D)
"I wonder how effectively I can learn these on the job?"
"...I wonder if there are any jobs that don't require $tool_existence+1 years of experience with any of these before they even consider candidates ._."
Goes back to writing PHP script in text editor
Pr3fix|7 years ago
yahyaheee|7 years ago
TeMPOraL|7 years ago
madushan1000|7 years ago
serverlessadmin|7 years ago