monus | 1 year ago

I built crik[1] to orchestrate CRIU operations inside a container running in Kubernetes so that you can migrate containers when a spot node gets a shutdown signal. I presented it at KubeCon Paris 2024 [2], with a deep dive for those interested in the technical details.

[1]: https://github.com/qawolf/crik

[2]: The Party Must Go On - Resume Pods After Spot Instance Shutdown, https://kccnceu2024.sched.com/event/1YeP3

JoosToopit|1 year ago

My process connects to, say, Postgres. What's going to happen to that connection upon restore?

Does crik guarantee the order of events? Saving a checkpoint must be followed by killing the old process/pod, which must be followed by restoration; the order of these three events is strict. And given that CRIU can checkpoint and restore socket state correctly, how does that work for Kubernetes? The new pod will have a different IP.

monus|1 year ago

TCP connections are identified by the tuple of source IP:port and target IP:port. When a new pod is created, it gets a new IP, so there is no practical way to restore the TCP connections. So crik drops all TCP connections and lets the application handle the reconnection logic. There are some CNIs that can give a static IP to a pod, but that's rather unorthodox in k8s.
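To illustrate what "lets the application handle the reconnection logic" can look like on the application side, here is a minimal sketch of a reconnect helper with exponential backoff. It is not taken from crik; the function name `connect_with_retry` and its parameters are hypothetical, and a real client (for example a Postgres driver) would usually rely on its own connection pool or retry settings instead.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, base_delay=0.5, timeout=3.0):
    """Open a TCP connection, retrying with exponential backoff.

    Hypothetical helper: after a restore drops the old connection, the
    application calls this again instead of reusing the dead socket.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:
            last_error = exc
            # Back off before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    raise ConnectionError(f"could not connect to {host}:{port}") from last_error
```

The application would call this whenever a send or receive on the old socket fails, and then redo any session setup (authentication, prepared statements, and so on) that lived on the dropped connection.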

rmetzler|1 year ago

> The new pod will have a different IP.

Usually clients would connect to a Kubernetes Service (svc) to avoid the problem of changing IPs. Even for just a single pod I would do that.

alexeldeib|1 year ago

Great talk! I'm curious about an approach like this combined with CUDA checkpoint for GPU workloads: https://github.com/NVIDIA/cuda-checkpoint

Animats|1 year ago

This makes sense for checkpointing and restoring long ML training runs.

Doing this on a networked application is going to be iffy. The restored program sees a time jump. The world in which it lives sees a replay of things the restored program already did once, if the restore is from a checkpoint taken before a later crash.
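One way an application might notice such a time jump is to compare how far the wall clock and the monotonic clock have each advanced since the last check. This is only a sketch: the class name `JumpDetector` and the 5-second threshold are made up for illustration, and it assumes the monotonic clock continues from its checkpointed value after a restore, which depends on how the restore is configured. Ordinary clock adjustments (for example a large NTP step) would trigger it too.

```python
import time


class JumpDetector:
    """Flag a wall-clock jump relative to the monotonic clock.

    Hypothetical helper; the clock functions are injectable for testing.
    """

    def __init__(self, threshold=5.0, wall=time.time, mono=time.monotonic):
        self._wall = wall
        self._mono = mono
        self.threshold = threshold
        self._last_wall = wall()
        self._last_mono = mono()

    def check(self):
        """Return the skew in seconds if it exceeds the threshold, else 0.0."""
        now_wall = self._wall()
        now_mono = self._mono()
        # Skew: how much further the wall clock moved than the monotonic clock.
        skew = (now_wall - self._last_wall) - (now_mono - self._last_mono)
        self._last_wall = now_wall
        self._last_mono = now_mono
        return skew if abs(skew) > self.threshold else 0.0
```

A periodic call to `check()` that returns a non-zero value could be used as a hint to expire cached tokens, leases, or timeouts. It does nothing about the replay problem, which needs idempotent operations on the receiving side.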

If you just want to migrate jobs within a cluster, there's Xen.