tornadofart | 1 month ago
A string value in a json config needed to be updated.
On one prod instance, there was a typo while updating the config by hand. The software's config validation caught it, the software stopped with the appropriate error message, and a few minutes later we were up and running again.
We introduced work reviews on prod instances (similar to code reviews) after that.
Later, he wrote a patch script to avoid making that mistake again.
In the JSON schema definition used in the script, the property name had a typo (how that came to be... no clue, copy-paste should have prevented it).
The script was part of an MR, and the reviewer missed the typo. We noticed it in staging.
We introduced tests for config editing scripts after that.
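To give an idea of what such a test can look like, here is a rough sketch, not our actual script. It assumes a flat JSON config, and the names `patch_string_property` and `endpoint_url` are made up for illustration. The point is that the patch function refuses to create new keys, so a misspelled property name fails loudly instead of silently adding a second property.

```python
import json


def patch_string_property(config_text: str, prop: str, new_value: str) -> str:
    """Return config_text with the existing string property `prop` updated.

    Hypothetical helper: it never creates keys, so a typo in `prop`
    raises instead of adding a misspelled property to the config.
    """
    config = json.loads(config_text)
    if prop not in config:
        raise KeyError(f"unknown property {prop!r}; refusing to add it")
    if not isinstance(config[prop], str):
        raise TypeError(f"property {prop!r} is not a string")
    config[prop] = new_value
    return json.dumps(config, indent=2)


def test_patch_rejects_misspelled_property() -> None:
    # "endpoint_url" is an assumed example property, not a real one.
    original = '{"endpoint_url": "https://old.example"}'

    # Correct property name: the value is replaced.
    patched = json.loads(
        patch_string_property(original, "endpoint_url", "https://new.example")
    )
    assert patched == {"endpoint_url": "https://new.example"}

    # Misspelled property name: the script must fail, not add a key.
    try:
        patch_string_property(original, "endpont_url", "https://new.example")
    except KeyError:
        pass
    else:
        raise AssertionError("typo in property name was not caught")
```

A test like this only catches a typo in the script's target property if the fixture config is a copy of the real one, so the fixture itself still needs a second pair of eyes.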
And so it went on and on... The problem is not that it happens and we then refine our processes. It is the frequency.
otterley|1 month ago
Fortify your delivery pipeline and the problem should resolve itself.
tornadofart|1 month ago