item 43551997

michaeljx | 11 months ago

Ah, OK, got it: the next token is sampled from a deterministic probability distribution, hence the random output. But why not just take the token with the highest probability/weight? Is this to avoid some local minima?

minimaxir | 11 months ago

It depends on your use case. Deterministic output is less "creative."
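To make the trade-off in this exchange concrete, here is a minimal sketch contrasting the two strategies: always taking the highest-scoring token (greedy) versus sampling in proportion to probability. The vocabulary, the scores, and the helper names (`softmax`, `greedy_pick`, `sample_pick`) are all made up for illustration and are not taken from any particular model or library.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Deterministic: always return the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature=1.0, rng=random):
    """Stochastic: draw an index in proportion to its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical vocabulary and scores, made up for illustration only.
vocab = ["the", "a", "cat", "dog"]
logits = [2.0, 1.5, 0.5, 0.1]

rng = random.Random(0)  # seeded only so that repeated runs match
greedy_tokens = [vocab[greedy_pick(logits)] for _ in range(5)]
sampled_tokens = [vocab[sample_pick(logits, temperature=1.0, rng=rng)] for _ in range(5)]
print("greedy: ", greedy_tokens)   # the same token every time
print("sampled:", sampled_tokens)  # varies with the random draws
```

Greedy picking returns the same token on every call, which is why it tends to read as less "creative." Lowering `temperature` toward 0 makes the sampled choice approach the greedy one, while raising it spreads probability onto lower-scoring tokens.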