The biggest thing to watch out for with this approach is that you will inevitably have some failure or bug that 10x's, 100x's, or 1000x's the rate of dead messages and overloads your DLQ database. You need a circuit breaker or rate limit on it.
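A minimal sketch of the idea: a token-bucket limiter in front of DLQ writes, so a runaway failure can only write at a bounded rate while a counter feeds monitoring. Names (`DLQRateLimiter`, `try_send`) and limits are hypothetical, not from any particular library.

```python
import time

class DLQRateLimiter:
    """Token-bucket limiter guarding writes to a DLQ database (illustrative sketch)."""

    def __init__(self, max_per_sec: float, burst: int):
        self.rate = max_per_sec      # steady-state refill rate
        self.capacity = burst        # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()
        self.dropped = 0             # drops should be exported to monitoring

    def try_send(self, message) -> bool:
        now = time.monotonic()
        # Refill tokens for elapsed time, capped at burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # caller proceeds to write the message to the DLQ
        self.dropped += 1  # over budget: drop (or sample) and alert instead
        return False

limiter = DLQRateLimiter(max_per_sec=1.0, burst=10)
results = [limiter.try_send({"id": i}) for i in range(50)]
```

With a tight loop of 50 failures, only roughly the first `burst` writes go through; the rest are counted and dropped, which is exactly the behavior you want when a bug 1000x's the dead-message rate.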
I worked on an app that sent an internal email with a stack trace whenever an unhandled exception occurred. It worked great until the day an OOM in a tight loop on a box in Asia sent a few hundred emails per second and saturated the company WAN backbone and the whole team's mailboxes. Good times.
The idea behind a DLQ is that it will retry (with some backoff) eventually, and if it fails enough times, the message stays there. You need monitoring to observe the messages that can't escape the DLQ. Ideally, nothing should ever stay in the DLQ, and if something does, it's a bug that should be fixed.
xyzzy_plugh|1 month ago
Sure, it's unavailability of course, but it's not data loss.
plaguuuuuu|1 month ago
(The queue probably isn't down if you've just pulled a message off it.)
j45|1 month ago
No need to look down on PG: it makes this more approachable, and it's no longer a specialized skill.