hmage | 2 years ago
I don't know exactly how these models work, but they managed to make Claude instant dumber than it is by default (see below).
Here's your question asked verbatim to all the models I have access to:
Claude instant:
Step by step working is as below:
* 20 + 20 = 40
* 40 + 20 = 60
* 60 + 21 = 81
Hence, 20 + 20 + 20 + 21 = 81
Claude v1: 20 + 20 + 20 + 21 = 81
Claude v2: Okay, let's solve this step-by-step: 20 + 20 + 20 + 21 = 20 + 20 + 20 + 21 = 60 + 21= 81. Therefore, 20 + 20 + 20 + 21 = 81.
mpt-30b-chat: The sum of 20 + 20 + 20 + 21 is 81.
gpt-3.5-turbo: 20 plus 20 plus 20 plus 21 equals 81.
gpt-4: 81
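For reference, the step-by-step working that Claude instant spells out above can be reproduced with a running sum. This is only a sketch to check the arithmetic; the variable names are mine and nothing here reflects how any of the models compute the answer internally.

```python
from itertools import accumulate

# The four terms from the question.
terms = [20, 20, 20, 21]

# Running totals after each addition: one entry per term.
partial_sums = list(accumulate(terms))

# Print each intermediate step, e.g. "20 + 20 = 40".
for previous, term, total in zip(partial_sums, terms[1:], partial_sums[1:]):
    print(f"{previous} + {term} = {total}")

# Print the full expression with its final result.
print(" + ".join(map(str, terms)), "=", partial_sums[-1])
```

The intermediate totals are 40, 60 and 81, matching the three steps in the Claude instant answer.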
hmage | 2 years ago
I've set the temperature to its lowest value to avoid randomness in the answers.
Prompt:
Claude instant: Claude v1: Claude v2: mpt-30b-chat: gpt-3.5-turbo: gpt-4:
Here we see that smaller models like mpt (30b) and claude instant (52b) can't do math "inside their head" and need the aid of working through the calculation step by step. I guess that's why all the models default to doing step-by-step when they see a math problem.
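On the temperature setting mentioned above: a rough sketch of why a low temperature reduces randomness, assuming the common formulation in which the model's raw scores (logits) are divided by the temperature before a softmax. The logits here are made-up numbers and `softmax_with_temperature` is a hypothetical helper, not any vendor's API; how each hosted model actually treats its lowest setting is not something I can confirm.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (illustrative values only).
logits = [2.0, 1.5, 0.5]

# As the temperature drops, probability mass concentrates on the top-scoring token.
for t in (1.0, 0.5, 0.05):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature approaches zero, sampling from this distribution behaves more and more like always picking the highest-scoring token, which is why repeated runs tend to give the same answer.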
im3w1l | 2 years ago
So I'm curious what would happen if you prompted it to stall for time a bit with an answer like "hmm.... err... let's see.. what about 81?"