(no title)
markasoftware | 1 month ago
If Modal's customers' workloads are mainly GPU-bound, then the performance hit of gVisor isn't as big as it might be for other workloads. GPU activity does have to go through the fairly heavyweight nvproxy to be executed on the host, but most GPU activity consists of longer-lived async calls like running kernels, so a bit of overhead in starting those calls and retrieving their results can be tolerated.
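To put rough numbers on that, here is a back-of-the-envelope sketch. The 20 µs per-call overhead and both durations are made-up values for illustration, not measurements of nvproxy or gVisor:

```python
def overhead_fraction(per_call_overhead_us: float, work_us: float) -> float:
    """Fraction of total wall time spent on fixed per-call sandbox overhead."""
    return per_call_overhead_us / (per_call_overhead_us + work_us)


# Hypothetical value, chosen only for illustration (not measured).
PROXY_OVERHEAD_US = 20.0  # assumed extra cost added to each proxied call

# A long-lived async call, e.g. a GPU kernel assumed to run for 10 ms.
long_kernel = overhead_fraction(PROXY_OVERHEAD_US, work_us=10_000.0)

# A short, frequent call that only does an assumed 5 us of real work.
short_call = overhead_fraction(PROXY_OVERHEAD_US, work_us=5.0)

print(f"long-running kernel: {long_kernel:.2%} of time is overhead")
print(f"short frequent call: {short_call:.2%} of time is overhead")
```

The same fixed cost is negligible when amortized over a long kernel and dominant when the calls are short and frequent, which is why the workload mix matters more than the raw per-call cost.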
Imustaskforhelp|1 month ago
I can agree that Modal might make sense for LLMs, but they position themselves as a sandbox for things like running Python code, and some of those workflows may be more intensive than others, so I just wanted to point that out.
Fly.io uses Firecracker, so I kind of like Firecracker-related applications (I tried to run Firecracker myself; it's way too hard to build your own Firecracker-based provider or anything), and they recently released https://sprites.dev/
E2B is another well-known solution out there. I talked to their developers once, and they mentioned that they run it on top of GCP.
I am really interested in Kata Containers as well, because I think Kata runs on top of Firecracker and can hook into Docker rather quickly.
unknown|1 month ago
[deleted]
amitprasad|1 month ago
Kata runs atop many things, but is a little awkward because it creates a "pod" (VM) inside which it creates one or more containers (runc/gVisor). Firecracker is also awkward because GPU support is pretty hard / impossible [1].
[1] https://fly.io/blog/wrong-about-gpu/