top | item 47205186

podnami | 14 hours ago

What happens before the probability distribution? I’m assuming that, say, alignment or other factors would influence it?

DavidSJ | 14 hours ago

In microgpt, there's no alignment. It's all pretraining (learning to predict the next token). But for production systems, models go through post-training, often with some sort of reinforcement learning, which modifies the model so that it produces a different probability distribution over output tokens.

But the model "shape" and the computation graph itself don't change as a result of post-training. All that changes is the weights in the matrices.
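
To make that concrete, here's a toy sketch in plain Python. It is not microgpt's actual code: the function names, the single linear layer, and all the numbers are made up for illustration. The same function (same "graph") is run with two weight matrices of identical shape, and only the values differ, which is roughly the sense in which post-training changes the output distribution.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(weights, hidden):
    # The fixed "computation graph": one linear layer followed by softmax.
    # A real model has many more layers, but the idea is the same.
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
    return softmax(logits)

# Hypothetical hidden state for some context (illustrative values only).
hidden = [1.0, -0.5]

# Hypothetical weights "after pretraining": 3 vocabulary tokens x 2 hidden dims.
pretrained = [[0.2, 0.1], [0.0, 0.3], [-0.1, 0.4]]

# Hypothetical weights "after post-training": same shape, one row nudged.
post_trained = [[0.2, 0.1], [1.5, 0.3], [-0.1, 0.4]]

p_before = next_token_probs(pretrained, hidden)
p_after = next_token_probs(post_trained, hidden)

print("before:", [round(p, 3) for p in p_before])
print("after: ", [round(p, 3) for p in p_after])
```

Both calls go through exactly the same code path. The only thing that differs is the numbers stored in the matrix, and with these made-up values the most likely token shifts from token 0 to token 1.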