item 39659850


Pwntheon | 2 years ago

Take this with a grain of salt because I'm not super well read on llms, but isn't their entire function built on prediction?

Sounds like a reasonable approach could be a separate "channel" that focuses entirely on the question "where is this conversation going?" That could give a pretty good baseline for when and how to interject.
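One way to picture the "separate channel" idea: a small auxiliary head that reads the same hidden state as the next-token head but predicts a conversation-trajectory label instead. This is just a sketch of the shape of such a thing, not how any real LLM is built; the sizes, labels, and weight matrices here are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 16        # hypothetical hidden-state size
VOCAB = 32         # hypothetical vocabulary size
TRAJECTORIES = 3   # e.g. "asking", "answering", "wrapping up" (illustrative labels)

# Shared hidden state the base model produced for the current position.
hidden = rng.normal(size=HIDDEN)

# Two linear "heads" reading the same hidden state:
W_token = rng.normal(size=(VOCAB, HIDDEN))        # standard next-token head
W_traj = rng.normal(size=(TRAJECTORIES, HIDDEN))  # hypothetical trajectory head

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

token_probs = softmax(W_token @ hidden)  # distribution over the next token
traj_probs = softmax(W_traj @ hidden)    # distribution over "where this is going"

print(token_probs.shape, traj_probs.shape)
```

The point of the sketch is only that the trajectory signal would be an explicit, separately trained output rather than something buried inside next-token probabilities.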


mlsu | 2 years ago

We don't have a model for "Where the conversation is going," we have a model for "What's the next token" which implicitly models "Where is the conversation going."

The difference is significant here, because doing the type of planning I've described requires direct manipulation of that implicit modeling task.

It's the same reason these LLMs are not "agents": you can only manipulate their world model through the interface of tokens.
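The "interface of tokens" point can be made concrete with a toy stand-in for an LLM: the only callable surface is context tokens in, next-token probabilities out. Everything below (the fake vocabulary, the hash-seeded weights) is invented purely to make the sketch runnable; the internal "world model" never appears in the signature, so the only way to probe where the conversation is going is to ask in tokens and read tokens back.

```python
import random

VOCAB = ["hello", "how", "are", "you", "bye", "?"]

def next_token_probs(context):
    # Stand-in for an LLM. The whole interface is: tokens in,
    # a distribution over the next token out. No hook exposes
    # whatever implicit plan or world model sits behind it.
    random.seed(hash(tuple(context)) % (2**32))
    weights = [random.random() for _ in VOCAB]
    total = sum(weights)
    return {t: w / total for t, w in zip(VOCAB, weights)}

# To ask "where is this conversation going?", a caller has no
# choice but to go through the same token interface, e.g. by
# appending a question and reading generated tokens back out:
context = ["hello", "how", "are", "you"]
probs = next_token_probs(context + ["?"])
print(max(probs, key=probs.get))
```

Any "steering" of the model happens by editing `context`, never by reaching into the model itself, which is the constraint the comment is describing.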