tritiy|1 year ago
My guess is the following:
Every time you talk with the LLM, it starts with a random 'state' (working weights), then reads the input tokens and predicts the follow-up. If you saved the 'state' (intermediate weights) after feeding in the prompt but before feeding in the user input, you would get the same output from the network every time, and any bias or similar quirk it has would now be 'baked in' to the model.
In addition, reading the input prompt should be quick: you are not asking the model to predict the next character until all the input has been read, so you do not gain much by saving the state.
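To make the state-saving idea concrete, here is a toy sketch in Python. It is not how any real LLM is implemented: `State`, `read_token`, `read_tokens`, and `predict` are hypothetical names invented for illustration, the "state" is a single integer, and it starts from a fixed value rather than a random one so the comparison is repeatable. It only shows that, for a deterministic reader, resuming from a snapshot taken after the prompt ends in the same state as reading everything from scratch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Toy stand-in for whatever a model keeps after reading tokens."""
    value: int = 0


def read_token(state: State, token: str) -> State:
    # Deterministic update: fold each character of the token into the state.
    h = state.value
    for ch in token:
        h = (h * 31 + ord(ch)) % 1_000_003
    return State(h)


def read_tokens(state: State, tokens: list[str]) -> State:
    # Read tokens one by one, carrying the state forward.
    for token in tokens:
        state = read_token(state, token)
    return state


def predict(state: State) -> int:
    # Toy "next token" prediction that depends only on the current state.
    return state.value % 100


prompt = "You are a helpful assistant .".split()
user_input = "What is 2 + 2 ?".split()

# Path A: read the prompt and the user input from scratch.
full = read_tokens(State(), prompt + user_input)

# Path B: snapshot the state after the prompt, then resume with the user input.
saved = read_tokens(State(), prompt)
resumed = read_tokens(saved, user_input)

# Both paths end in the same state, so they make the same prediction.
assert full == resumed
assert predict(full) == predict(resumed)
```

In this sketch the snapshot `saved` can be reused for any later `user_input`, which is the saving being discussed; whether that is worth it depends on how expensive reading the prompt is compared with generating the reply.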
cma|1 year ago
pegasus|1 year ago