(no title)
alexwebb2 | 1 year ago
https://chatgpt.com/share/6722ca8a-6c80-800d-89b9-be40874c5b...
https://chatgpt.com/share/6722ca97-4974-800d-99c2-bb58c60ea6...
TZubiri|1 year ago
1- Run the query through a classifier to figure out if the question involves numbers or math
2- Extract the function and the operands
3- Do the math operation with standard non-LLM mechanisms
4- Feed the solution back to the LLM
5- Concatenate the math answer with the LLM answer using string substitution
So in a strict sense this is not very representative of the logical capabilities of an LLM.
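The five steps listed above can be sketched roughly as follows. This is only an illustration of the described flow, not how any real product is known to work: the regex "classifier", the `fake_llm` stub, and the `{result}` placeholder are all hypothetical stand-ins, and it only handles a single binary operation.

```python
import operator
import re

# Hypothetical pattern: two numbers joined by one arithmetic operator.
EXPR = re.compile(r"(-?\d+(?:\.\d+)?)\s*([+\-*/])\s*(-?\d+(?:\.\d+)?)")
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def involves_math(query: str) -> bool:
    """Step 1: stand-in classifier (a real system might use a trained model)."""
    return EXPR.search(query) is not None


def to_number(text: str):
    """Parse as int when possible so integer results stay exact."""
    return float(text) if "." in text else int(text)


def extract(query: str):
    """Step 2: pull out the operator and its two operands."""
    left, op, right = EXPR.search(query).groups()
    return op, to_number(left), to_number(right)


def compute(op: str, left, right):
    """Step 3: plain, non-LLM arithmetic."""
    return OPS[op](left, right)


def fake_llm(prompt: str, tool_result=None) -> str:
    """Step 4: placeholder for a model call (assumption, not a real API).

    When a tool result is supplied, the stub answers with a template
    that contains a slot for the computed value.
    """
    if tool_result is None:
        return "(model-only reply)"
    return "The answer is {result}."


def answer(query: str) -> str:
    if not involves_math(query):
        return fake_llm(query)
    op, left, right = extract(query)
    value = compute(op, left, right)
    reply = fake_llm(query, tool_result=value)
    # Step 5: splice the computed value into the model's text.
    return reply.replace("{result}", str(value))


print(answer("What is 12 * 34?"))
```

In this sketch the arithmetic never passes through the model at all, which is the point being made: a correct numeric answer from such a pipeline says little about the model's own reasoning.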
digging|1 year ago
This confusion was introduced at the top of the thread. If the argument is "LLMs plus tooling can't do X," the argument is wrong. If the argument is "LLMs alone can't do X," the argument is worthless. In fact, if the argument is that binary at all, it's a bad argument and we should laugh it out of the room; the idea that a lay person uninvolved with LLM research or development could make such an assertion is absurd.
thomashop|1 year ago
unknown|1 year ago
[deleted]
astrange|1 year ago
https://chatgpt.com/share/6723477e-6e38-8000-8b7e-73a3abb652...
https://chatgpt.com/share/6723478c-1e08-8000-adda-3a378029b4...
https://chatgpt.com/share/67234772-0ebc-8000-a54a-b597be3a1f...
_flux|1 year ago
TaylorAlexander|1 year ago
https://machinelearning.apple.com/research/gsm-symbolic
famouswaffles|1 year ago
Changing names does not affect the performance of SOTA models.
zmgsabst|1 year ago
Their errors appear to disappear when you correctly set the context to adversarial testing rather than conversational; Apple is actually testing the social context, not the models' ability to reason.
I’m just waiting for Apple to release their GSM-NoOp dataset to validate that; preliminary testing shows it’s the case, but we’d prefer to use the same dataset so it’s an apples-to-apples comparison. (They claim it will be released “soon”.)
gruez|1 year ago
thomashop|1 year ago
If you want a more scientific answer, there is this recent paper: https://machinelearning.apple.com/research/gsm-symbolic