No, I tested the paid GPT-4 last year on similar questions (animal cognition) and it was so bad I decided it was a waste of money. I actually don't care if it's maybe gotten better in the past year, and I'm certainly not spending money to find out. Last I checked the best LLMs still have a 5-15% confabulation rate on simple document summarization. In 2023 GPT-4 had a ~75% confabulation rate on animal cognition questions, but even 5% is not reliable enough for me to want to use it.
The high school AI tutor probably wasn't using GPT-4, but the district definitely paid a lot of money for the software.
I also hate this entire argument, that AI confabulations don't matter for free products. Unreliable software like GPT-4o shouldn't be widely released to the public as a cool new tech product, and certainly not handed out for free.
I have tried some chemistry problems on the latest models and they still get simple math wrong (messing up the conversion between micrograms and milligrams, for example) unless you tell them to think carefully.
nicklecompte|1 year ago
walterbell|1 year ago
andrepd|1 year ago
jeffreyrogers|1 year ago