DrammBA | 1 month ago
Also, tests and proof checkers only catch what they’re asked to check. If the LLM misunderstands the intent but produces a consistent implementation and proof, everything “passes” and is still wrong.
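To make that concrete, here is a minimal Python sketch. The scenario is made up for illustration and is not from any real project: assume the request was "return the median" and the model read it as "return the mean". The function names are hypothetical too. The test is written from the same misreading as the code, so the two agree with each other and not with the intent.

```python
# Hypothetical intent (an assumption for this example): "return the median".
# Suppose the model misreads that as "return the mean".

def central_value(values):
    """Misunderstood implementation: computes the mean, not the median."""
    return sum(values) / len(values)


def test_central_value():
    # Written from the same misreading, so it agrees with the code above.
    assert central_value([1, 2, 9]) == 4


def intended_central_value(values):
    """What the requester actually wanted: the median."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# The test suite is green...
test_central_value()

# ...but the result still disagrees with what was asked for.
assert central_value([1, 2, 9]) != intended_central_value([1, 2, 9])
```

Nothing in the test can flag the problem, because the only statement of the real intent lives outside both the code and the test.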
simonw|1 month ago
tsimionescu|1 month ago
daxfohl|1 month ago
Given that so much of the work of managing these systems has become rote, my only conclusion is that what's left (before getting to 95+% engineer replacement) is an "agent engineering" problem, not an AI research problem.