top | item 43786709

imoreno | 10 months ago

Yes, let's not say what's wrong with the tech, otherwise someone might (gasp) fix it!

rybosworld|10 months ago

Tuning the model output to perform better on certain prompts is not the same as improving the model.

It's valid to worry that the model makers are gaming the benchmarks. If you think that's happening and you want to personally figure out which models are really the best, keeping some prompts to yourself is a great way to do that.

namaria|10 months ago

Keeping your questions to yourself is no guarantee that no one else has published something similar. This is bad reasoning all the way through. The problem is in trying to use a question as a benchmark. The only way to really compare models is to create a set of tasks of increasing compositional complexity and run the models you want to compare through them. And you'd have to come up with a new body of tasks each time a new model is published.

Providers will always game benchmarks because they are a fixed target. If LLMs were developing general reasoning, that would be unnecessary. The fact that providers do is evidence that there is no general reasoning, just second-order overfitting (loss on token prediction does descend, but that doesn't prevent the 'reasoning loss' from being uncontrollable: cf. 'hallucinations').

ls612|10 months ago

Who’s going out of their way to optimize for random HNers’ informal benchmarks?

aprilthird2021|10 months ago

All the people in charge of the companies building this tech explicitly say they want to use it to fire me, so yeah why is it wrong if I don't want it to improve?

idon4tgetit|10 months ago

"Fix".

So long as the grocery store has groceries, most people will not care what a chat bot spews.

This forum is full of syntax- and semantics-obsessed loonies who think the symbolic logic represents the truth.

I look forward to being able to use my own creole to manipulate a machine's state to act like a video game or a movie rather than rely on the special literacy of other typical copy-paste middle class people. Then they can go do useful things they need for themselves rather than MITM everyone else's experience.

genewitch|10 months ago

A third meaning of creole? Huh, I did not know it meant something other than a cooking style and a people in Louisiana (mainly). As in, I did not know it was a more generic term. Also, in the context you used it, it seems to mean a pidgin that becomes a semi-official language?

I also seem to remember that something to do with pit bbq or grilling has creole as a byproduct - distinct from creosote. You want creole because it protects the thing in which you cook as well as imparts flavor, maybe? Maybe I have to ask a Cajun.