top | item 40499031

Shrezzing | 1 year ago

Since models are very good at writing very short computer programs, and computer programs are very good at mathematical calculations, would it not be considerably more efficient to train them to recognise a "what is x + y" type problem, and respond with the answer to "write and execute a small JavaScript program to calculate x + y, then share the result"?
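A minimal sketch of what this routing could look like, in JavaScript since that's the language proposed. Everything here is illustrative: the regex, the fallback string, and the use of `eval` stand in for the model recognising the question and emitting a tiny program.

```javascript
// Hypothetical sketch: route "what is x + y" questions to a generated
// program instead of having the model predict the digits token-by-token.

// Stand-in for the model's output: a tiny program for the question,
// or null if the question isn't simple arithmetic.
function generateProgram(question) {
  const m = question.match(/what is (-?\d+)\s*\+\s*(-?\d+)/i);
  if (m === null) return null;
  return `${m[1]} + ${m[2]}`; // the "small program" is just an expression here
}

// Execute the generated program and share the result.
function answer(question) {
  const program = generateProgram(question);
  if (program === null) return "(fall back to plain language modelling)";
  // A real system would sandbox this instead of calling eval directly.
  return String(eval(program));
}

console.log(answer("What is 1234 + 5678?")); // "6912"
console.log(answer("Tell me a joke"));       // falls back
```

The point of the sketch is only the routing: detection of an arithmetic question, code generation, execution, and sharing the result, exactly the four steps the comment describes.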

Grimblewald | 1 year ago

From a getting-answers perspective, yes; from an understanding-LLMs perspective, no. If you read the abstract, you can see how this goes beyond arithmetic and helps with long-form reasoning.

simiones | 1 year ago

But that's not really relevant to the question "can LLMs do math?". People don't need ChatGPT to replace a calculator; they are interested in whether the LLM has learned higher reasoning skills from its training on language (especially since we know it has "read" more math books than any human could in a lifetime). Responding with a program that reuses the + primitive in JS proves no such thing. Even responding with a description of the addition algorithm doesn't prove that it has "understood" maths if it can't actually run that algorithm itself; that's essentially looking up a memorized definition. The only real proof is having the LLM itself perform the addition (without any special-case logic).

This question is of course relevant only in a research sense, in seeking to understand to what extent and in what ways the LLM is acting as a stochastic parrot vs gaining a type of "understanding", for lack of a better word.

Shrezzing | 1 year ago

That's a fair summary of why the research is happening. Thanks.

gmerc | 1 year ago

That's in fact what ChatGPT does ... because 99% accurate math is not useful to anyone.

ADeerAppeared | 1 year ago

This is a cromulent approach, though it would be far more effective to have the LLM generate computer-algebra-system instructions.

The problem is that it's not particularly useful: as the problem's complexity increases, the user's prompt has to become increasingly specific, rapidly approaching a fully exact specification. There's simply no point if the prompt has to (basically) spell out the entire program.

And at that point, the user might as well use the backing system directly, and we should just write a convenient input DSL for that.

andrepd | 1 year ago

Yes, this is what external tools/plugins/API calls are all about.
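The tool-call pattern the comment refers to can be sketched like this. The message shape and all names (`calculator`, `toolCall`, `expression`) are hypothetical, loosely modelled on function-calling APIs rather than any specific one: the model emits a structured request instead of an answer, the host runs the tool, and the result goes back into the conversation.

```javascript
// Illustrative tool-call loop. The host, not the model, does the arithmetic.
const tools = {
  calculator: ({ expression }) => {
    // In practice: a sandboxed evaluator or a computer algebra system.
    return Function(`"use strict"; return (${expression});`)();
  },
};

// Stand-in for a model response that requests a tool invocation
// instead of answering directly.
const modelResponse = {
  toolCall: { name: "calculator", args: { expression: "17 * 24" } },
};

function runToolCall(response) {
  const { name, args } = response.toolCall;
  // The returned value would be appended to the conversation so the
  // model can phrase the final answer around an exact result.
  return tools[name](args);
}

console.log(runToolCall(modelResponse)); // 408
```

This is why the earlier point about "99% accurate math" matters: once the calculation is delegated, the numeric result is exact, and the model only has to produce the surrounding prose.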