fridental | 2 years ago
I am also not sure the author's assumptions about business requirements are valid: thumbnails not being generated doesn't look to me like a reason to be woken up at night.
Also, you usually have an intern or two on the support team who can re-run failed jobs, so you don't need to waste the time of your devs, ops, or devops on that.
barrkel|2 years ago
Thumbnails not being generated might not be worth an early morning alarm, but running out of disk space might be, and so might other work being blocked by the failure of thumbnail generation.
fridental|2 years ago
This means: merge the PR first, let it go live, use your working students or interns to rerun stuff, and wait for a month. If it is still happening, then you have proof of a problem that needs to be fixed.
Disk space: use your monitoring tool to proactively warn you when free disk space drops below 20% or is shrinking too quickly.
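To make the disk-space rule concrete, here is a rough sketch of what such a check could look like if you had to script it yourself rather than configure it in a monitoring tool. The 20% threshold is the figure mentioned above; the "too quickly" rate (5 percentage points per hour), the function names, and the one-minute sampling interval are all assumptions picked for illustration, not recommendations.

```python
import shutil
import time

LOW_FREE_FRACTION = 0.20  # warn below 20% free (figure from the comment above)
MAX_DROP_PER_HOUR = 0.05  # assumed: warn if the free share falls >5 points per hour


def free_fraction(path="/"):
    """Return the free share (0.0 to 1.0) of the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def disk_warnings(prev, curr, low=LOW_FREE_FRACTION, max_drop=MAX_DROP_PER_HOUR):
    """Compare two (timestamp_seconds, free_fraction) samples and list warnings."""
    (t0, f0), (t1, f1) = prev, curr
    warnings = []
    # Rule 1: absolute threshold.
    if f1 < low:
        warnings.append(f"free space {f1:.0%} is below {low:.0%}")
    # Rule 2: rate of decline between the two samples.
    hours = (t1 - t0) / 3600
    if hours > 0 and (f0 - f1) / hours > max_drop:
        warnings.append(f"free space dropping {(f0 - f1) / hours:.1%} per hour")
    return warnings


if __name__ == "__main__":
    # Take two samples a minute apart and print any warnings.
    prev = (time.time(), free_fraction("/"))
    time.sleep(60)
    curr = (time.time(), free_fraction("/"))
    for w in disk_warnings(prev, curr):
        print("WARNING:", w)
```

In practice you would let your existing monitoring tool do this; the point is only that both rules are cheap to express and neither requires waking anyone up.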
If some other work is blocked by failed thumbnails, that is a logic bug and not a consequence of the message bus. That work was already being blocked before the message bus was introduced anyway.