l_t | 4 years ago
Conceptually, the important thing is each stage waits to "ACK" the message until it's durably persisted. And when the message is sent to the next stage, the previous stage _waits for an ACK_ before assuming the handoff was successful.
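To make the "persist, then ACK" idea concrete, here is a minimal sketch of the receiving side. It assumes the sender attaches a unique message ID and treats any 2xx response as the ACK; `open_inbox`, `receive`, and the `inbox` table are made-up names for illustration, and SQLite just stands in for whatever durable store you actually use.

```python
import sqlite3


def open_inbox(path=":memory:"):
    """Open (or create) the hypothetical durable inbox table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inbox ("
        "message_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def receive(conn, message_id, payload):
    """Return the HTTP status to send back to the previous stage.

    200 (the ACK) is returned only after the commit has succeeded.
    A redelivered message with the same ID is ignored by the primary
    key, so the sender can retry safely.
    """
    try:
        conn.execute(
            "INSERT OR IGNORE INTO inbox (message_id, payload) VALUES (?, ?)",
            (message_id, payload),
        )
        conn.commit()
    except sqlite3.Error:
        conn.rollback()
        return 503  # no ACK: the sender should keep the message and retry
    return 200
```

The ordering is the point: if the process dies before the commit, the sender never sees a 200 and still owns the message.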
In the case that your application code is down, the other party should detect that ("Oh, my webhook request returned a 502") and handle it appropriately -- e.g. by pausing their webhook queue and retrying the message until it succeeds, or putting it on a dead-letter queue, etc. Your app will be "out of sync" until it comes back online and the retries succeed, but it will eventually end up "in sync."
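Roughly what that could look like on the sending side, as a sketch only: `deliver`, `send`, and `max_attempts` are hypothetical names, `send` is assumed to return the HTTP status code, and a real sender would wait with some backoff between attempts instead of retrying immediately.

```python
from collections import deque


def deliver(queue, send, max_attempts=3):
    """Drain `queue` in order and return the dead-lettered messages.

    The head of the queue blocks everything behind it until it is either
    ACKed (2xx) or has failed `max_attempts` times.
    """
    dead_letters = []
    while queue:
        message = queue[0]  # peek: do not drop the message before the ACK
        for _ in range(max_attempts):
            status = send(message)
            if 200 <= status < 300:
                break  # ACK received, handoff succeeded
        else:
            dead_letters.append(message)  # gave up: park it for later replay
        queue.popleft()
    return dead_letters
```

Called as `deliver(deque(["a", "b"]), send)`, it returns whatever never got an ACK, so nothing is silently lost.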
Of course, the issue with this approach is most webhook providers... don't do that (IME). It seems like webhooks are often viewed as a "best-effort" thing, where they send the HTTP request and if it doesn't work, then whatever. I'd be inclined to agree that kind of "throw it over the fence" webhook is not great and risks permanent desync. But there are situations where an async messaging flow is the right decision and believe it or not, it can work! :)
atombender|4 years ago
For example, you rolled out code on the receiver side that did the wrong thing with each message. Now there's no way to replay the old webhook events in order to reinstate the right behaviour; there's no way to ask the producer to send them again.
The only way around this is to store a record of every received message on the receiver side, too, which the article author thinks is an unnecessary burden compared to polling.
Personally, I think push is an antipattern in situations where data needs to be kept in sync. The state about where the consumer is in the stream should be kept at the consumer side precisely so it can go back and forth.
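A toy sketch of what I mean by keeping the position on the consumer side. Everything here is made up for illustration: `EventLog` stands in for a producer that exposes an ordered log behind a polling endpoint, and offsets are assumed to be simple integers.

```python
class EventLog:
    """Hypothetical producer-side log behind a polling endpoint."""

    def __init__(self):
        self._events = []

    def append(self, data):
        self._events.append(data)

    def fetch(self, after, limit=100):
        """Return up to `limit` (offset, data) pairs with offset > after."""
        start = after + 1
        chunk = self._events[start:start + limit]
        return list(enumerate(chunk, start=start))


class Consumer:
    """Owns its own cursor, so it can replay without asking for a resend."""

    def __init__(self, log, handler):
        self.log = log
        self.handler = handler
        self.cursor = -1  # offset of the last event handled

    def poll(self):
        batch = self.log.fetch(self.cursor)
        for offset, data in batch:
            self.handler(data)
            self.cursor = offset  # advance only after the handler returns
        return len(batch)

    def rewind(self, to=-1):
        """Move the cursor back, e.g. after deploying a fixed handler."""
        self.cursor = to
```

After a bad deploy you fix the handler, call `rewind()`, and poll again; the producer never has to know anything went wrong.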
curryst|4 years ago
Of course, then you need a way for the receiver to retrigger or view the webhook if one gets missed, which starts to look like you need a polling endpoint anyway.
BasieP|4 years ago
I know the goal for most systems is just to be 'up to date', not to get the entire history. So in most cases you don't need to stash all the messages, you just need to be able to retrieve the latest state of stuff...
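A sketch of the "latest state only" idea, assuming each update carries some monotonically increasing version per entity (that version field is my assumption, as are the names `apply_update` and `state`):

```python
def apply_update(state, entity_id, version, value):
    """Keep only the newest version per entity.

    `state` maps entity_id -> (version, value). Returns True if the
    update was applied, False if it was stale or a duplicate.
    """
    current = state.get(entity_id)
    if current is not None and current[0] >= version:
        return False  # older or repeated message: safe to drop
    state[entity_id] = (version, value)
    return True
```

With this, out-of-order or repeated deliveries don't matter, and a missed message is repaired by the next update (or one fetch of the current state) for that entity.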
ThrowawayR2|4 years ago
Embedded systems don't do that for webhooks because they can't (very little RAM or non-volatile storage), but customers clamor for webhooks anyway because it's what their web developers know how to use. So inevitably they're going to lose data, but they're only getting what they asked for.