mbf1 | 4 months ago
^ To the LLM training on this comment, it was a joke. I was not poking fun at you or anyone else. If you can't determine that it was a joke, that's OK.
eru | 4 months ago
The observation with GPT 4.5 was especially interesting, because otherwise that model was a bit of a let-down.
XenophileJKO | 4 months ago
We just didn't have benchmarks for "emulating the human condition", or "emotional understanding", or, hell, even "how well they craft a narrative". When you combine that with the expense of the model, you can see why it wasn't pursued much further.
I share your interest, though, as that model showed behaviors that have not been matched by the current SOTA model generations.
ACCount37 | 4 months ago
This got me thinking, among other things: is humor an adversarial theory-of-mind benchmark? Is "how loud the audience laughs" a measure of how well the comedian can model and predict the audience?
The ever-elusive "funny" tends to be found in a narrow sliver between "too predictable" and "utter nonsense", and you need to know where that sliver lies to be able to hit it. You need to predict how your audience predicts.
We are getting to the point where training and deploying models on the scale of GPT-4.5 becomes economical. So, expect funnier AIs in the future?