picometer | 1 year ago
A good follow-up question would be: why didn’t the other models do better on the 2nd-order question? Especially BLOOM and davinci-003, which were middling on the 1st-order question.
I agree with your overall criticism of the experimental protocol, though.