The biggest thing to watch out for with this approach is that you will inevitably have some failure or bug that 10x's, 100x's, or 1000x's the rate of dead messages and overloads your DLQ database. You need a circuit breaker or rate limit on it.
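A minimal sketch of the idea: a token-bucket limiter in front of DLQ writes, so a runaway failure can only write at a bounded rate while a counter feeds monitoring. Names (`DLQRateLimiter`, `try_send`) and limits are hypothetical, not from any particular library.

```python
import time

class DLQRateLimiter:
    """Token-bucket limiter guarding writes to a DLQ database (illustrative sketch)."""

    def __init__(self, max_per_sec: float, burst: int):
        self.rate = max_per_sec      # steady-state refill rate
        self.capacity = burst        # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()
        self.dropped = 0             # drops should be exported to monitoring

    def try_send(self, message) -> bool:
        now = time.monotonic()
        # Refill tokens for elapsed time, capped at burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # caller proceeds to write the message to the DLQ
        self.dropped += 1  # over budget: drop (or sample) and alert instead
        return False

limiter = DLQRateLimiter(max_per_sec=1.0, burst=10)
results = [limiter.try_send({"id": i}) for i in range(50)]
```

With a tight loop of 50 failures, only roughly the first `burst` writes go through; the rest are counted and dropped, which is exactly the behavior you want when a bug 1000x's the dead-message rate.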
I worked on an app that sent an internal email with a stack trace whenever an unhandled exception occurred. It worked great until the day an OOM in a tight loop on a box in Asia sent a few hundred emails per second and saturated the company WAN backbone and the whole team's mailboxes. Good times.
The idea behind a DLQ is that it will retry (with some backoff) eventually, and if it fails enough times, the message stays there. You need monitoring to observe the messages that can't escape the DLQ. Ideally, nothing should ever stay in the DLQ, and if something does, it's a bug that should be fixed.
xyzzy_plugh|1 month ago
Sure, it's unavailability of course, but it's not data loss.
plaguuuuuu|1 month ago
(The queue probably isn't down if you've just pulled a message off it.)
j45|1 month ago
No need to look down on PG: it makes this more approachable, and it's no longer a specialized skill.