incidentiq | 1 month ago
1. "AI SRE" useful or hype? Useful in specific contexts, but the trust barrier is real. Most on-call engineers are skeptical of AI suggestions during incidents because the cost of a wrong recommendation at 3am is high. That said, the pain of digging through logs and finding relevant context is also real.
2. Where it helps: The biggest wins are in "pre-work": surfacing relevant past incidents before you start investigating, correlating alerts that are likely related, and summarizing what changed recently. All of that shortens the "context gathering" phase, which often eats 30%+ of incident time.
3. Trust requirements: For me to trust it:
- Show confidence levels and your reasoning. "Here's what I found and why" beats "do this."
- Be a copilot that accelerates my investigation, not one that acts on my behalf.
- Get the easy stuff 100% right before attempting the hard stuff. If log correlation is wrong on obvious patterns, I won't trust root cause suggestions.
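To make the "correlate alerts, then show why" idea concrete, here is a rough sketch of what I mean by getting the easy stuff right. Everything in it is made up for illustration: the `Alert` fields, the 5-minute window, and the "same service" rule are assumptions, not how any particular tool works. It groups alerts on the same service that fire close together and returns a plain-language reason for each group instead of an instruction.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Alert:
    # Hypothetical alert shape; real alerting systems carry far more fields.
    name: str
    service: str
    fired_at: datetime


def correlate(alerts, window=timedelta(minutes=5)):
    """Group alerts on the same service that fire within `window` of the previous one."""
    groups = []
    latest_group = {}  # service name -> most recent group for that service
    for alert in sorted(alerts, key=lambda a: a.fired_at):
        group = latest_group.get(alert.service)
        if group is not None and alert.fired_at - group[-1].fired_at <= window:
            group.append(alert)
        else:
            # Either a new service or too long since its last alert: start a new group.
            group = [alert]
            groups.append(group)
            latest_group[alert.service] = group
    return groups


def explain(group, window=timedelta(minutes=5)):
    """Say what was found and why, rather than telling the engineer what to do."""
    names = ", ".join(a.name for a in group)
    service = group[0].service
    if len(group) == 1:
        return f"{names} on {service}: no other alert on this service within {window}"
    return (
        f"{len(group)} alerts on {service} ({names}): grouped because each fired "
        f"within {window} of the previous one"
    )
```

Calling `correlate(alerts)` and then `explain(group)` on each result gives the on-call engineer something they can check in seconds. A rule this simple is easy to verify, which is the point: if this layer is wrong on obvious patterns, nothing built on top of it earns trust.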
The RAPTOR approach for runbooks is interesting. The "loss of context in chunked RAG" problem is real for long-form incident docs. How do you handle cases where the relevant context spans multiple documents (e.g., a runbook that references an architecture doc)?