item 47101351

rybosworld | 8 days ago

No, it wouldn't be enough to falsify.

This isn't an experiment a consumer of the models can actually run. If you have a chance to read the article I linked: it is difficult even for the model maintainers (OpenAI, Anthropic, etc.) to look into a model and see what it actually used in its reasoning process. The models will purposefully hide information about how they reasoned, and they will ignore instructions without telling you.

The problem really isn't that LLMs can't ever get math/arithmetic right. They certainly can. The problem is that there's a high probability they will get the math wrong. Python and similar tools were the answer to that inconsistency.


simianwords | 8 days ago

What do you mean? You can explicitly restrict access to the tools. You are factually incorrect here.

rybosworld | 8 days ago

I believe you're referring to the tools array? https://developers.openai.com/api/docs/guides/tools/

These are the external tools that you are allowing the model to have access to. There is also a suite of internal tools that the model has access to regardless.

The external Python tool is there so the model can provide the user with Python code that they can see.

You can read a bit more about the distinction between the internal and external tool capabilities here: https://community.openai.com/t/fun-with-gpt-5-code-interpret...

"I should explain that both the “python” and “python_user_visible” tools execute Python code and are stateful. The “python” tool is for internal calculations and won’t show outputs to the user, while “python_user_visible” is meant for code that users can see, like file generation and plots."

But really, the most important thing is that we as end-users cannot know with any certainty whether the model used Python or not. That's what the alignment-faking article describes.