Those are about distributed consensus, making sure participants come to the same conclusion about something and nobody has the wrong answer.
Distributed snapshots try to do as little work as possible to get a consistent view of the distributed computation, without forcing the heavy cost of consensus on it. For example, if node A is sending a message to node B, we don't care which of these we capture:
- 1: A before it sends the message, B before it receives the message
- 2: A after it has sent the message, the in-flight message itself, and B before it receives the message
- 3: A after it has sent the message, B after it has received the message
No matter which of those states we restore, the computation will continue correctly.
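
To make that concrete, here is a toy sketch of the three cases, not a real snapshot algorithm. Everything in it is made up for illustration: the `Snapshot` fields, the `resume` helper, the balances, and the `TRANSFER` amount are all assumptions. It models A sending B a single message carrying 3 tokens, restores from each captured state, and runs the transfer to completion. Two inconsistent captures are included for contrast: one that records A after the send but drops the in-flight message, and one that records B after the receive but A before the send.

```python
from dataclasses import dataclass

TRANSFER = 3  # hypothetical amount carried by the single A -> B message


@dataclass(frozen=True)
class Snapshot:
    a_balance: int      # node A's local state
    a_has_sent: bool    # whether A has already sent its message
    in_flight: tuple    # messages recorded on the A -> B channel
    b_balance: int      # node B's local state


def resume(snap):
    """Restore from a snapshot, run the transfer to completion, return (A, B)."""
    a, b = snap.a_balance, snap.b_balance
    channel = list(snap.in_flight)
    if not snap.a_has_sent:
        # A still has to send: debit A and put the message on the channel.
        a -= TRANSFER
        channel.append(TRANSFER)
    while channel:
        # B receives everything that is still on the channel.
        b += channel.pop(0)
    return a, b


# The three captures from the list above (A starts with 10 tokens, B with 0).
CONSISTENT_CUTS = {
    "1: before send": Snapshot(10, False, (), 0),
    "2: in flight": Snapshot(7, True, (TRANSFER,), 0),
    "3: after receive": Snapshot(7, True, (), 3),
}

# Captures that a snapshot protocol must avoid.
INCONSISTENT_CUTS = {
    "message lost": Snapshot(7, True, (), 0),
    "message duplicated": Snapshot(10, False, (), 3),
}

for name, snap in {**CONSISTENT_CUTS, **INCONSISTENT_CUTS}.items():
    print(name, "->", resume(snap))
```

All three consistent captures resume to the same final state, A with 7 and B with 3. The "lost" capture ends with only 7 tokens in the system and the "duplicated" one with 13, which is why case 2 has to record the message on the channel along with the two node states.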
jeffreygoesto|1 year ago
[0] https://blog.acolyer.org/2015/04/22/distributed-snapshots-de...
scrubs|1 year ago
wg0|1 year ago
yencabulator|1 year ago