top | item 44540376

(no title)

On the other hand, we can also diagnose the LLM itself: its activation values are its EEG, and its gradients are its BOLD signal. If you are willing to pay the compute cost, you can even calculate its true variational free energy, that is, the KL divergence.
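To make the analogy concrete, here is a rough sketch in plain Python of reading all three signals from a toy one-layer softmax model. Everything in it is an assumption for illustration and is not taken from the linked repository: the function names (`softmax`, `kl_divergence`, `probe`) are hypothetical, the "activations" are just the logits, and the KL term is taken between a target distribution and the model's output, which is only one reading of what "variational free energy" means here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def probe(weights, x, target):
    """Run one forward pass of a toy linear model and record diagnostics.

    weights: list of rows, one row per output class (hypothetical toy model)
    x:       input vector
    target:  target distribution over classes (must sum to 1)
    """
    # "EEG": the raw activations (here, the logits W x).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    probs = softmax(logits)
    # "BOLD": gradient of KL(target || probs) with respect to the logits,
    # which for a softmax output is simply probs - target.
    grad_logits = [p - t for p, t in zip(probs, target)]
    return {
        "activations": logits,
        "grad_logits": grad_logits,
        "kl": kl_divergence(target, probs),
    }


if __name__ == "__main__":
    report = probe([[0.2, -0.1], [0.0, 0.3]], x=[1.0, 2.0], target=[1.0, 0.0])
    for name, value in report.items():
        print(name, value)
```

For a real LLM you would read the same quantities from the framework's own hooks and autograd rather than computing them by hand; the point of the sketch is only that all three are ordinary numbers you can log per step.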

"Don't just train your model, understand its mind."

https://github.com/dmf-archive/
