top | item 37473288


Icko | 2 years ago

> Primarily, they are pure functions that accept a sequence of tokens and return the next token. The model itself is stateless, and it doesn't seem right to me to ascribe "intent" to a stateless function. Even if the function is capable of modeling certain aspects of chess.

I have two arguments against. One, you could argue that state is transferred between the layers. It may be inelegant for each chain of state transitions to be the same length, but it seems to work. Two, it may not have "states", but if the end result is the same, does it matter?


bt1a|2 years ago

That's a great way of looking at it. Comparing model weights to our brains and how we process input, you could imagine model weights as a brain frozen at time t=0. The prompt tokens are the sensory input, and the generation parameters are like twists to how the neurons pass information to each other. The token context window is like the capacity of one's working memory. At the conclusion of the last layer of processing, the output tokens are like one's subjective experience.

At the very least, it's made me think for a moment about `stateless` and what it means.

pyinstallwoes|2 years ago

Your thoughts are just prompts to DeusGPT

sharedbeans|2 years ago

Just because you use some intermediate variables to calculate f(x,y) = x^2 + y^2 doesn't make it a non-pure function. At least at the level of abstraction we're talking about (the API boundary).
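To make that concrete, here is a minimal Python sketch. `f` is the function from the comment above; `next_token` is a made-up toy stand-in for a model's forward pass (the arithmetic, the layer count, and the 50-token vocabulary are all assumptions for illustration, not how any real model works). Both use intermediate values internally, yet the same input always gives the same output and nothing survives the call.

```python
def f(x, y):
    # Intermediate variables exist only for the duration of the call.
    x_squared = x * x
    y_squared = y * y
    return x_squared + y_squared


def next_token(tokens):
    # Hypothetical toy "forward pass": each step hands a hidden value to the
    # next one, loosely like state passed between layers. The hidden value
    # is local, so nothing is remembered once the function returns.
    hidden = 0
    for position, token in enumerate(tokens, start=1):
        hidden = (hidden * 31 + token * position) % 1000
    return hidden % 50  # toy vocabulary of 50 token ids


if __name__ == "__main__":
    prompt = [3, 14, 15]
    # Calling twice with the same prompt yields the same token.
    print(f(3, 4), next_token(prompt), next_token(prompt))
```

From the caller's side of the API boundary, there is no way to tell whether intermediates were used at all.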

The more significant application of storage will be long-term storage wrapped in a read-modify-write loop.
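A rough sketch of what such a loop could look like, assuming a hypothetical pure step function `respond` and a plain dict standing in for long-term storage (neither comes from a real library). The step function stays pure; the state lives entirely in the wrapper that reads, modifies, and writes it back.

```python
def respond(memory, user_input):
    # Hypothetical pure step: builds a new memory tuple instead of mutating
    # the old one, and returns it alongside the reply.
    new_memory = memory + (user_input,)
    reply = f"seen {len(new_memory)} inputs"
    return reply, new_memory


def run(inputs):
    store = {"memory": ()}  # stands in for long-term storage
    replies = []
    for item in inputs:
        memory = store["memory"]               # read
        reply, memory = respond(memory, item)  # modify
        store["memory"] = memory               # write
        replies.append(reply)
    return replies, store["memory"]


if __name__ == "__main__":
    replies, final_memory = run(["hello", "world"])
    print(replies, final_memory)
```

The system as a whole is stateful even though `respond` on its own is not.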