tkahnoski | 2 years ago
Splunk was used by a much larger product (easily 10x our scale) for monitoring events so there was no red tape to start using it.
After launching the detailed instrumentation (1 structured log event per HTTP request with a breakout of database/service activity) I was able to gain all of the insight needed and build a simple user/url lookup dashboard page to help other engineers see what was going on. We went from being mostly blind to almost full visibility in less than two weeks.
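A minimal sketch of that kind of per-request event, assuming Python and hypothetical names (`RequestEvent`, `timed`, and the `breakdown` field layout are illustrative, not the original implementation). Each request accumulates timings per category and emits a single JSON line at the end, which a log indexer can then search by user or URL:

```python
import json
import time
from contextlib import contextmanager


class RequestEvent:
    """Collects timings for one HTTP request and renders one JSON log line."""

    def __init__(self, method, url, user):
        self.fields = {"method": method, "url": url, "user": user}
        self.breakdown = {}  # category -> {"count": calls, "ms": total time}
        self._start = time.perf_counter()

    @contextmanager
    def timed(self, category):
        """Time a block of work (e.g. a DB query) under the given category."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            entry = self.breakdown.setdefault(category, {"count": 0, "ms": 0.0})
            entry["count"] += 1
            entry["ms"] += elapsed_ms

    def finish(self, status):
        """Return the single structured event for this request as JSON."""
        total_ms = (time.perf_counter() - self._start) * 1000.0
        event = dict(self.fields)
        event["status"] = status
        event["total_ms"] = round(total_ms, 3)
        event["breakdown"] = {
            name: {"count": e["count"], "ms": round(e["ms"], 3)}
            for name, e in self.breakdown.items()
        }
        return json.dumps(event, sort_keys=True)


# Hypothetical request: two database calls and one downstream service call.
event = RequestEvent("GET", "/orders/42", "alice")
with event.timed("db"):
    pass  # run a query here
with event.timed("db"):
    pass  # run another query here
with event.timed("billing_service"):
    pass  # call a downstream service here
print(event.finish(200))
```

Keeping everything in one event per request is what makes the user/URL lookup cheap, but it is also why each event is much larger than a standard access-log line.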
The downside was that we increased our billable Splunk usage by 50%, since we were capturing so much more data per log event than the other product, which was just consuming standard IIS/Apache logs.
That type of flexibility was totally worth it. Due to some acquisition shenanigans we broke off from that group and wound up on the ELK stack, which didn't perform quite as well but was still usable with the same data. Today we could have just built an OpenTelemetry library.
hparadiz|2 years ago
sib|2 years ago
closeparen|2 years ago
ilyt|2 years ago