top | item 44452226

zekica | 8 months ago

The two parts of your statement don't go together. The list of candidate output tokens and their probabilities is generated deterministically, but the token actually returned is then chosen at random, weighted by each token's probability, with the "temperature" parameter controlling how strongly the weighting favors the likeliest tokens.
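A minimal sketch of those two stages, assuming a plain softmax over made-up scores. The token list, the `logits` values, and the helper names are hypothetical and are not taken from any particular LLM implementation:

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Deterministic stage: turn raw scores into probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature, rng):
    """Random stage: draw one token, weighted by its probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

tokens = ["cat", "dog", "fish"]   # hypothetical vocabulary
logits = [2.0, 1.5, 0.1]          # hypothetical model scores

probs = softmax_with_temperature(logits, temperature=0.8)
picked = sample_token(tokens, logits, 0.8, random.Random())
print(probs, picked)
```

The probabilities come out the same on every call; only the final draw varies. Lower temperatures concentrate the weight on the top-scoring token.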

galaxyLogic | 8 months ago

I assume they use software-based pseudo-random number generators. Those can typically be given a seed value, which deterministically fixes the sequence of random numbers that will be generated.

So if an LLM uses a seedable pseudo-random number generator for its random numbers, then it can be fully deterministic.
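As a rough illustration of the seeding argument, here is a sketch using Python's standard `random.Random`. The tokens, weights, and the `sample_sequence` helper are hypothetical stand-ins for a model's per-step distribution:

```python
import random

def sample_sequence(seed, steps=10):
    """Draw a sequence of weighted random tokens from a seeded PRNG."""
    rng = random.Random(seed)      # same seed -> same stream of numbers
    tokens = ["a", "b", "c"]       # hypothetical vocabulary
    weights = [0.5, 0.3, 0.2]      # hypothetical fixed probabilities
    return [rng.choices(tokens, weights=weights, k=1)[0] for _ in range(steps)]

run_1 = sample_sequence(seed=1234)
run_2 = sample_sequence(seed=1234)
print(run_1 == run_2)  # the two runs draw identical sequences
```

This only shows that the sampling step is repeatable when the probabilities fed into it are bit-for-bit identical between runs.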

lou1306 | 8 months ago

There are subtle sources of nondeterminism in concurrent floating-point operations, especially on GPUs: floating-point addition is not associative, so the order in which parallel partial results are combined can change the low-order bits of the result. So even with a fixed seed, if an LLM encounters two tokens with very close likelihoods, it may pick one or the other across different runs. This has been observed even with temperature=0, which in principle does not involve _any_ randomness (see the arXiv paper cited earlier in this thread).
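A toy sketch of the effect, on the CPU with ordinary Python floats rather than a real GPU kernel. The two summation orders stand in for two different schedules of a parallel reduction, and the token names and scores are hypothetical:

```python
a, b, c = 0.1, 0.2, 0.3

# Same three numbers, combined in two different orders.
left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6

def pick(scores):
    """Return the token with the highest score (first one wins a tie)."""
    return max(scores, key=scores.get)

# Token "x" has a score that depends on summation order;
# token "y" has a fixed score that is almost identical.
pick_left = pick({"y": 0.6, "x": left})
pick_right = pick({"y": 0.6, "x": right})

print(left == right)          # False
print(pick_left, pick_right)  # x y
```

No random number generator is involved here, yet the chosen token differs because the two scores are within one rounding error of each other.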

mzl | 8 months ago

That depends on the sampling strategy. Greedy sampling takes the highest-probability token at each step.
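For comparison, a minimal sketch of a greedy step, assuming the probabilities are already computed. The `greedy_pick` helper and the example values are hypothetical:

```python
def greedy_pick(tokens, probs):
    """Greedy step: return the token with the highest probability."""
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return tokens[best_index]

tokens = ["cat", "dog", "fish"]   # hypothetical vocabulary
probs = [0.2, 0.7, 0.1]           # hypothetical probabilities

choice = greedy_pick(tokens, probs)
print(choice)  # dog
```

No random draw happens at all, so given identical probabilities the output is always the same.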