top | item 41365289

(no title)

tritiy | 1 year ago

My guess is the following: every time you talk with the LLM, it starts from a random 'state' (working weights), then reads the input tokens and predicts the follow-up. If you were to save that 'state' (intermediate weights) after inputting the prompt but before inputting the user input, you would always get the same output from the network, which might carry a bias or similar that you have now 'baked in' to the model. In addition, reading the input prompt should be quick: you are not asking the model to predict the next character until all the input has been read, at which point you do not gain much by saving the state.


cma|1 year ago

No, any randomness comes from the temperature setting, which mainly controls how much the model samples from the probability mass of the next output versus choosing the single most likely next token (always choosing the most likely one tends to get models stuck in repetitive, loop-like conversations).
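
To make the temperature point concrete, here is a minimal sketch of temperature sampling over a toy distribution. The function name `sample_next_token` and the example logits are made up for illustration; real inference stacks differ in the details (top-k, top-p, batching), so treat this as an assumption-laden toy rather than how any particular model does it.

```python
import math
import random

def sample_next_token(logits, temperature, rng):
    """Pick a token index from raw scores (hypothetical helper).

    temperature == 0 is treated as greedy decoding: always the top score.
    Higher temperatures flatten the distribution before sampling.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.1]   # toy scores for three candidate tokens
rng = random.Random(0)     # seeded so the run is repeatable

greedy = [sample_next_token(logits, 0, rng) for _ in range(5)]
sampled = [sample_next_token(logits, 1.0, rng) for _ in range(1000)]
```

In this toy, `greedy` picks the same token every time, while `sampled` spreads across all three tokens in rough proportion to their softmax probabilities.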

pegasus|1 year ago

There's randomness besides what's implied by the temperature. Even when temperature is set to zero, the models are still nondeterministic.
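
The comment does not say where that leftover nondeterminism comes from. One commonly cited candidate (an assumption here, not something stated in the thread) is that floating-point addition is not associative, so a parallel reduction that sums in a different order can produce a slightly different score and flip a near-tie even under greedy decoding. A toy sketch, with `greedy_pick` and the numbers invented for illustration:

```python
def greedy_pick(score_a, score_b):
    """Greedy choice between two candidate tokens (hypothetical helper)."""
    return "A" if score_a > score_b else "B"

terms = [0.1, 0.2, 0.3]  # partial sums contributing to token A's score

# Same numbers, different summation order.
left_to_right = (terms[0] + terms[1]) + terms[2]
right_to_left = terms[0] + (terms[1] + terms[2])

rival = 0.6  # token B's score, a near-tie with token A

pick_1 = greedy_pick(left_to_right, rival)
pick_2 = greedy_pick(right_to_left, rival)
```

Here the two summation orders differ in the last bit, which is enough to change which token wins the comparison.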