trhway | 3 days ago

>I don’t know how you get here from “predict the next word.”

The question puts the cart before the horse. The main point isn't where you get "from" next-word prediction, it is how you get to “predict the next word.” During training, the LLM builds inside itself a compressed, aggregated representation (a model) of what is fed into it. Given that model, you can "predict the next word", and you can do a lot of other things as well.

As a simple starting point for understanding, I'd suggest looking back at the key foundation stone that started it all, the "sentiment neuron":

https://openai.com/index/unsupervised-sentiment-neuron/

"simply predicting the next character in Amazon reviews resulted in discovering the concept of sentiment.

...

Digging in, we realized there actually existed a single “sentiment neuron” that’s highly predictive of the sentiment value."
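
To make "a single neuron that's highly predictive" concrete, here is a toy sketch of the probing idea. It is not OpenAI's code or data: the activations are synthetic, the signal is planted by hand in one unit, and the names (`make_synthetic`, `most_predictive_unit`, `planted`) are made up for illustration. A simple per-unit correlation stands in for whatever probe you would really use.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / (vx * vy) ** 0.5


def most_predictive_unit(activations, labels):
    """Return (index, |correlation|) of the hidden unit that best tracks the labels."""
    n_units = len(activations[0])
    scores = [
        abs(pearson([row[j] for row in activations], labels))
        for j in range(n_units)
    ]
    best = max(range(n_units), key=scores.__getitem__)
    return best, scores[best]


def make_synthetic(n_samples=200, n_units=16, planted=5, seed=0):
    """Hypothetical stand-in for hidden states: noise everywhere,
    except unit `planted`, which is made to follow the sentiment label."""
    rng = random.Random(seed)
    labels = [rng.choice([0, 1]) for _ in range(n_samples)]
    activations = []
    for y in labels:
        row = [rng.gauss(0, 1) for _ in range(n_units)]
        row[planted] = (1.0 if y else -1.0) + rng.gauss(0, 0.3)
        activations.append(row)
    return activations, labels


activations, labels = make_synthetic()
unit, score = most_predictive_unit(activations, labels)
print(f"most predictive unit: {unit}, |correlation| = {score:.3f}")
```

In the real case nobody plants the signal. The point of the quote is that next-character training put it there on its own, and a probe like this only reads it out.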
