upon_drumhead | 1 year ago

If an issue can be automatically detected and remediated, do you really need a runbook? That space has to be huge. I don't see a purpose for documenting it.

That said, a tool that runs through existing runbooks and improves them or suggests new ones would be extremely useful IMHO.

cortesoft|1 year ago

> I don't see a purpose for documenting it.

Because when it goes wrong you will want to know what it did. When you discover something new, you are going to want to be able to change the runbook. New employees are going to want to learn how things work from the runbook.

Why WOULDN'T you want to document what it is doing? I would never trust an AI that didn't tell me what it was doing and why.

threeseed|1 year ago

> I don't see a purpose for documenting it.

Enterprises implement stringent Change Management procedures.

If you are making any change to a Prod environment it needs to be thoroughly documented.

Atotalnoob|1 year ago

Improving documentation.

Keep in mind, they are suggestions. It sounds like the product will automatically execute runbooks but hold suggestions for engineer input. Approving one would move it from “suggestion” to “automatically do X”.

Also, sometimes LLMs are wrong.

jtsaw|1 year ago

The product will automatically execute runbooks for you. So far we've focused on using runbooks customers already have, since they know those work for them. We've also added the ability to turn off automatic execution for cases like a suggested runbook, so the customer can make any necessary edits before approving it to be executed automatically.

Yea, this is a big challenge for us. We're using a variety of strategies to make sure hallucinations are rare, but that's why we're also committed to not executing actions that modify your cluster unless explicitly specified in a runbook.
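
As a rough illustration of the gating described above (not the product's actual code or API), here is a minimal sketch. `Runbook`, `auto_execute`, `allowed_mutations`, and `may_run` are all hypothetical names made up for this example. It assumes a suggested runbook starts with automatic execution off, and that a cluster-modifying action runs only if the runbook lists it explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runbook:
    # All fields are hypothetical, chosen only for this illustration.
    name: str
    auto_execute: bool = False  # suggested runbooks start with this off
    allowed_mutations: frozenset = frozenset()  # actions the runbook names explicitly


def may_run(runbook: Runbook, action: str, mutates_cluster: bool) -> bool:
    """Return True only if the action may be executed automatically."""
    if not runbook.auto_execute:
        # Not yet approved by an engineer: hold everything for review.
        return False
    if mutates_cluster:
        # Cluster-modifying actions must be spelled out in the runbook.
        return action in runbook.allowed_mutations
    # Read-only actions in an approved runbook are allowed.
    return True


approved = Runbook(
    name="restart-crashing-pod",
    auto_execute=True,
    allowed_mutations=frozenset({"restart_pod"}),
)
suggested = Runbook(name="llm-suggested-fix")

print(may_run(approved, "restart_pod", mutates_cluster=True))
print(may_run(approved, "delete_node", mutates_cluster=True))
print(may_run(suggested, "get_logs", mutates_cluster=False))
```

The point of the design is that the default is to refuse: an unapproved runbook runs nothing, and a hallucinated mutating step that is not in the runbook is blocked even after approval.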