I see a lot of notes about EFS's performance in the comments. I figured it's at least worth noting, for anyone considering using ECS with EFS, that just last week EFS had its read throughput on its general purpose tier increased by 400%.
That probably won't solve all EFS performance issues, but it's a pretty big boost and a nice announcement to come alongside ECS support.
Yes, these containers are supposed to be stateless, but I was tasked with converting an app at my previous job over to using ECS on Fargate and we hit so many issues because of the limits on storage per container instance. We ended up having to tweak the heck out of nginx caching configurations and other processes that would generate any "on disk" files to get around the issues. Having EFS available would have made solving some of those problems so much easier.
I've also been wanting to use ECS on Fargate for running scheduled tasks with large files (50 GB+), but it wasn't really possible given the previous 4 GB limit on storage.
> Yes, these containers are supposed to be stateless,
You've got it backwards. NFS-type services help containers be stateless because they are a separate service, accessed through an interface, where all the state is handled by a third party.
So by using an NFS-type service to store your local files, you are free to kill and respawn containers at will, because their data is persisted elsewhere.
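To make that concrete, here's a rough sketch of a container keeping its state on a shared mount instead of its own disk. The `STATE_DIR` variable and the `save_state`/`load_state` helpers are made up for illustration; the temp-file-plus-rename step is just one common way to avoid readers seeing a half-written file, not something specific to EFS.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical location of the shared volume inside the container.
# Falls back to the system temp dir so the sketch runs anywhere.
STATE_DIR = Path(os.environ.get("STATE_DIR", tempfile.gettempdir()))


def save_state(name, state, state_dir=STATE_DIR):
    """Write state as JSON to a temp file, then rename it into place."""
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / f"{name}.json"
    fd, tmp_path = tempfile.mkstemp(dir=state_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
        fh.flush()
        os.fsync(fh.fileno())  # push the data out before the rename
    os.replace(tmp_path, target)
    return target


def load_state(name, default=None, state_dir=STATE_DIR):
    """Read state back; a freshly respawned container starts from here."""
    target = Path(state_dir) / f"{name}.json"
    if not target.exists():
        return default if default is not None else {}
    return json.loads(target.read_text())
```

A replacement container would call `load_state("job")` on startup and carry on from whatever the previous one last saved.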
Containers shouldn't necessarily be stateless; most existing code doesn't know how to, or want to, talk to services via RPC interfaces. In some sense, a mounted remote filesystem is just a standard API the OS provides for accessing state in a convenient way that happens to be high performance, indexed, etc.
Oh man, awesome. We had a rather janky workload where ECS would spin up an EC2 that would then mount an EFS volume and then write a file over to S3. This is going to make that so much easier and cleaner.
If you're wondering why you'd ever have to do something like that, the answer is SAP.
How's the performance on EFS? Has anyone used it in production that is willing to share their experience?
We evaluated it for a relatively simple use case, and the performance seemed abysmal, so we didn't select it. I'm hoping that we made a mistake in our evaluation protocol, which would give me an excuse to give it another try.
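For anyone wanting to redo this kind of evaluation, here's a minimal sketch of a small-file timing test you could point at an EFS mount and at a local or EBS-backed directory to compare. The function name, file count, and file size are arbitrary choices of mine, not a standard benchmark. Note that the read pass may be served from the client's cache, so the write number is likely the more telling one.

```python
import os
import time
from pathlib import Path


def time_small_files(directory, count=200, size=4096):
    """Return (write_seconds, read_seconds) for `count` files of `size` bytes."""
    directory = Path(directory)
    payload = os.urandom(size)
    paths = [directory / f"bench_{i}.bin" for i in range(count)]

    # Write pass: fsync each file so the data actually leaves the client.
    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
    write_seconds = time.perf_counter() - start

    # Read pass: may hit the local cache rather than the remote filesystem.
    start = time.perf_counter()
    for path in paths:
        with open(path, "rb") as fh:
            fh.read()
    read_seconds = time.perf_counter() - start

    # Clean up the benchmark files.
    for path in paths:
        path.unlink()
    return write_seconds, read_seconds
```

Running it against both directories with the same `count` and `size` gives a rough per-file latency comparison, which is where NFS-style storage tends to differ most from local disk.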
It's terrible. Very slow when we tried to use it. There are ways to work around this, and ways to tune the performance, but honestly it was not worth it for our use case and instead we found a way to make EBS work.
EFS is a great way to get a lot of iowait on your cpu graphs. Would not recommend it for anything that had to be fast.
It is highly dependent on your needs. It's NFS, and performs accordingly (though EFS has been rock solid in a few different scenarios, in terms of both availability and baseline performance, assuming you use dedicated IOPS).
Should you run a database on EFS? No. Can you use it to back media files for a web application that are cached using a CDN, or for data files used for processing or temporary storage? Yes, it shines in those use cases... and it's cheaper than dedicating the time required to maintain your own NFS cluster.
Even Gluster or Ceph is, IMO, not worth the effort unless you (a) know how to run and maintain it, and (b) absolutely need the potential speed up that you can get, assuming a well-configured and well-maintained system.
It feels like the performance and cost are really built around a very specific use case that basically boils down to "write logs and only read a tiny fraction of those logs".
And then, I've seen way too many people treat it like a traditional file system, and stick things on it that don't expect to find themselves on NFS, and wonder why they get corrupted files.
And, really, I tend to avoid the AWS services with "Burst Balances". It's painful to get a system running smoothly only to have it grind to a halt when you use it under load because some burst balance somewhere went to zero. Your mileage may vary, of course.
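A back-of-the-envelope way to see when that cliff arrives, as a sketch: this assumes a simplified model where credits accrue at a fixed baseline rate and are spent at whatever rate you actually drive. The function and the numbers are hypothetical, not AWS's exact accounting.

```python
def seconds_until_empty(credit_bytes, baseline_bps, usage_bps):
    """Seconds until the burst balance reaches zero.

    Returns None when usage does not exceed the baseline, because the
    balance never drains in that case.
    """
    drain_bps = usage_bps - baseline_bps
    if drain_bps <= 0:
        return None
    return credit_bytes / drain_bps


MIB = 2**20
TIB = 2**40

# Hypothetical numbers: 1 TiB of credits, 50 MiB/s baseline, 100 MiB/s load.
remaining = seconds_until_empty(1 * TIB, 50 * MIB, 100 * MIB)
print(f"{remaining / 3600:.1f} hours until the balance hits zero")
```

With those made-up figures the balance lasts a bit under six hours, which is easily long enough for a load test to pass and a production day to fail.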
For a high performance shared file system on AWS, an alternative is ObjectiveFS[0]. It uses memory caching to achieve performance closer to local disk.
Technically ECS supported EFS before, but you had to configure everything manually (or with your own automation). Having it native is a lot nicer, and brings provisioning of NFS-style volumes up to par with the current Kubernetes experience.
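For reference, the native version looks roughly like the sketch below: an ECS task definition with an EFS volume and a matching mount point, built as a plain Python dict. The field names follow the ECS task definition format as I understand it, so check them against the current docs. The helper name, the volume name, the CPU/memory sizes, and every argument value are placeholders.

```python
import json


def efs_task_definition(family, image, file_system_id, container_path,
                        root_directory="/"):
    """Build a Fargate task definition dict that mounts an EFS file system."""
    volume_name = "shared-efs"  # hypothetical name linking volume and mount
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "256",
        "memory": "512",
        "containerDefinitions": [
            {
                "name": "app",
                "image": image,
                "essential": True,
                "mountPoints": [
                    {
                        "sourceVolume": volume_name,
                        "containerPath": container_path,
                        "readOnly": False,
                    }
                ],
            }
        ],
        "volumes": [
            {
                "name": volume_name,
                "efsVolumeConfiguration": {
                    "fileSystemId": file_system_id,
                    "rootDirectory": root_directory,
                    "transitEncryption": "ENABLED",
                },
            }
        ],
    }


# Placeholder values, not real resources.
task_def = efs_task_definition("nightly-job", "example/image:latest",
                               "fs-00000000", "/mnt/shared")
print(json.dumps(task_def, indent=2))
```

The resulting dict is shaped so it could be handed to something like boto3's `register_task_definition(**task_def)`, though I haven't shown that here since it needs real credentials and a real file system ID.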
txcwpalpha|5 years ago
https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-el...
finaliteration|5 years ago
rumanator|5 years ago
dekhn|5 years ago
blueter|5 years ago
jboggan|5 years ago
koolba|5 years ago
mark242|5 years ago
sciurus|5 years ago
zapita|5 years ago
codeduck|5 years ago
geerlingguy|5 years ago
banana_giraffe|5 years ago
objectivefs|5 years ago
[0]: https://objectivefs.com
geerlingguy|5 years ago
WatchDog|5 years ago
rkwasny|5 years ago
My advice, stick to EC2 + EBS, it works.
djstein|5 years ago
nnx|5 years ago