This is simplified a bit: it's just a "machine" that maps [sequence of input tokens] -> [probability of each possible next token].

First you define a list of tokens. Let's say the 26 letters of the alphabet, because that's easier.
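To make the token list concrete, here is a minimal sketch. It assumes the tokens are the lowercase English letters; the names `VOCAB`, `TOKEN_TO_ID`, `encode`, and `decode` are made up for illustration and don't come from any particular library.

```python
import string

# Hypothetical token list: one token per lowercase English letter.
VOCAB = list(string.ascii_lowercase)
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

def encode(text):
    """Turn a string into a list of integer token ids."""
    return [TOKEN_TO_ID[ch] for ch in text]

def decode(ids):
    """Turn a list of token ids back into a string."""
    return "".join(VOCAB[i] for i in ids)

print(encode("cab"))   # [2, 0, 1]
print(decode([7, 8]))  # hi
```

Real systems usually use tokens bigger than single letters, but the idea is the same: everything the machine sees is a list of integers.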

The model takes an input sequence of tokens, does a deterministic series of matrix operations, and outputs a list with the probability of every token coming next.
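A toy version of that machine might look like the sketch below. It is a deliberately tiny stand-in, not how a real model is built: the sizes, the averaging step, and the names `embedding`, `output_weights`, and `next_token_probs` are all assumptions for illustration. Real models stack many more operations, but the shape is the same: token ids in, one probability per token out.

```python
import math
import random

VOCAB_SIZE = 26  # assumed: one token per letter
EMBED_DIM = 8    # assumed width, chosen arbitrarily

random.seed(0)
# The numbers inside the matrices. Random here, i.e. untrained.
embedding = [[random.gauss(0, 0.1) for _ in range(EMBED_DIM)]
             for _ in range(VOCAB_SIZE)]
output_weights = [[random.gauss(0, 0.1) for _ in range(VOCAB_SIZE)]
                  for _ in range(EMBED_DIM)]

def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    biggest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(token_ids):
    """Map a sequence of token ids to a probability for every token."""
    # Look up one row per input token and average them into one vector.
    rows = [embedding[t] for t in token_ids]
    hidden = [sum(col) / len(rows) for col in zip(*rows)]
    # Vector-matrix product: one score per token in the vocabulary.
    scores = [sum(h * w for h, w in zip(hidden, column))
              for column in zip(*output_weights)]
    return softmax(scores)

probs = next_token_probs([2, 0, 1])  # ids for "c", "a", "b"
print(len(probs), sum(probs))
```

The same input always gives the same output, and there is no branching in `next_token_probs`: it is lookups, sums, and multiplications all the way through.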

"Learning" is just the process of setting some of the numbers inside the matrices used for some of those operations.
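Here is a rough sketch of what setting those numbers can look like. It assumes the simplest possible model (one row of scores per previous letter) and plain gradient descent on cross-entropy loss; the names `weights` and `train_step`, the learning rate, and the training string are all made up for illustration.

```python
import math

VOCAB_SIZE = 26  # assumed: one token per lowercase letter
# One row of scores per previous token. All zeros means "no idea yet".
weights = [[0.0] * VOCAB_SIZE for _ in range(VOCAB_SIZE)]

def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(prev_id, next_id, learning_rate=0.5):
    """Nudge the numbers so next_id becomes more likely after prev_id."""
    probs = softmax(weights[prev_id])
    for j in range(VOCAB_SIZE):
        target = float(j == next_id)  # 1.0 for the correct token, else 0.0
        # Gradient of the cross-entropy loss with respect to score j.
        weights[prev_id][j] -= learning_rate * (probs[j] - target)
    return -math.log(probs[next_id])  # loss measured before the update

# Toy training data: the letters of "abababab" as token ids.
text = [ord(ch) - ord("a") for ch in "abababab"]
for epoch in range(20):
    loss = sum(train_step(p, n) for p, n in zip(text, text[1:]))

print(softmax(weights[0])[1])  # probability of "b" right after "a"
```

Before training, every letter is equally likely after "a" (1 in 26). After training, "b" gets most of the probability, and the only thing that changed is the numbers in `weights`.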

Notice that there's only a single "if" statement in their final code, and it's for evaluating the accuracy of the result. All of the "logic" comes from the result of these matrix operations.
