(no title)
barishnamazov | 1 month ago
Not that humans can't make these mistakes (in fact, I have nuked my own home directory before), but I don't think this is a specific problem that guardrails can currently solve. I'm looking for innovations (either model-wise or engineering-wise) that would do better than letting an agent run code until a goal is seemingly achieved.