top | item 47058007

(no title)

i agree that is annoying but seems like anthropic's stance is that the task/agent should be provided an environment to write the file in the output you provide or provided a skill.md description on how to do that specific task.

personally it's a blurry line. most times i'm interacting with an agent where outputting to a file makes sense but it makes it less reliable when treating the model call as a deterministic function call.

discuss

XCSme|11 days ago

There's definitely many ways to improve the output of the AI, and provide it extra hints. Also, some AIs are made for a specific use-case. Maybe I should rephrase it and say that those benchmarks are more about the single-reply intelligence of a model, and more like an AGI test then for specific use-cases.