top | item 27824467

(no title)

l_t | 4 years ago

Not disagreeing with your point, and I'm sure you already know this, I just wanted to point out (for the benefit of people that don't have other options) that it is possible to build "webhooks" in such a way that you're confident nothing is dropped and nothing goes (permanently) out of sync. (At least, AFAIK -- correct me if this sounds wrong!)

Conceptually, the important thing is each stage waits to "ACK" the message until it's durably persisted. And when the message is sent to the next stage, the previous stage _waits for an ACK_ before assuming the handoff was successful.

In the case that your application code is down, the other party should detect that ("Oh, my webhook request returned a 502") and handle it appropriately -- e.g. by pausing their webhook queue and retrying the message until it succeeds, or putting it on a dead-letter queue, etc. Your app will be "out of sync" until it comes back online and the retries succeed, but it will eventually end up "in sync."

Of course, the issue with this approach is most webhook providers... don't do that (IME). It seems like webhooks are often viewed as a "best-effort" thing, where they send the HTTP request and if it doesn't work, then whatever. I'd be inclined to agree that kind of "throw it over the fence" webhook is not great and risks permanent desync. But there are situations where an async messaging flow is the right decision and believe it or not, it can work! :)

discuss

atombender|4 years ago

This misses the problem explained in the article, which is that there are scenarios where events are "acked" but things still go wrong because of bugs.

For example, you rolled out code on the receiver side that did the wrong thing with each message. Now there's no way to replay the old webhooks events in order to reinstate the right behaviour; there's no way to ask the producer to send them again.

The only way around this is to store a record of every received message on the receiver side, too, which the article author thinks is an unnecessary burden compared to polling.

Personally, I think push is an antipattern in situations where data needs to be kept in sync. The state about where the consumer is in the stream should be kept at the consumer side precisely so it can go back and forth.

curryst|4 years ago

If you want to be 100% sure that you get all the webhooks, the sender could implement an incrementing "webhook ID". If the receiver knows the last webhook ID was 53 and the sender sends one for 55, you can tell one has been dropped. There are some other concerns around that like if 54 has been sent but they arrived out of order, or if they arrive almost simultaneously. Nothing that isn't solvable afaict though.

Of course, then you need a way for the receiver to retrigger or view the webhook if one gets missed, which starts to look like you have to have a polling endpoint anyways, though.

BasieP|4 years ago

We have a system that pushes loads of messages (as in thousands a minute) and some consumer insists on using there http backend to push the messages to. There system is down every once in a while for quite some time. We're using an async queueing solution, but you can't keep those messages forever. We sometimes have milions of messages for them in there queue's, which take up space... If all of our consumers had those problems we would have to buy loads of storage.. We're simply dropping messages older than x, and have an endpoint that they can call to retreive the 'latest state of things'. This way when they come back from a failure, they simply get the latest state, and then continue with updates from our end.. It's far from perfect, but it works really well.

I know the goal for most systems is just to be 'up to date' Not to get the entire history. So in most cases you don't need to stash all the messages, you just need to be able to retreive the latest state of stuff...

ThrowawayR2|4 years ago

> "Of course, the issue with this approach is most webhook providers... don't do that "

Embedded systems don't do that for webhooks because they can't (very little RAM or non-volatile storage) but customers clamor for webhooks anyway because it's what their web developers know how to use. So inevitably they're going to lose data but they're only getting what they asked for.