Interestingly, Shannon did write about entropy in relation to the English language, and about how, given a sequence of tokens, the next token can be predicted from how often it follows that same sequence in other bodies of text: http://medientheorie.com/doc/shannon_redundancy.pdf
This is from 1950. I wonder what he would have to say about today’s LLMs.
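
For anyone curious what "predicting the next token from probabilities in other text" looks like in practice, here is a rough sketch of a character-level n-gram counter. It is only an illustration of the general idea, not a reproduction of the procedure in the linked paper, and the names (`build_model`, `predict_next`, `conditional_entropy`) are made up for this example.

```python
from collections import Counter, defaultdict
import math


def build_model(text, n=2):
    """Count which character follows each n-character context in text."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n):
        counts[text[i:i + n]][text[i + n]] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent follower of context, or None if unseen."""
    followers = counts.get(context)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def conditional_entropy(counts):
    """Estimate bits per character given the preceding context."""
    total = sum(sum(f.values()) for f in counts.values())
    h = 0.0
    for followers in counts.values():
        ctx_total = sum(followers.values())
        for k in followers.values():
            # Weight by how often this (context, follower) pair occurs overall.
            h -= (k / total) * math.log2(k / ctx_total)
    return h


if __name__ == "__main__":
    sample = "the theory of the thing is that the text repeats"
    model = build_model(sample, n=2)
    print(predict_next(model, "th"))
    print(round(conditional_entropy(model), 3))
```

The longer the context and the larger the body of text, the lower the estimated entropy tends to get, which is roughly the redundancy being described. A toy like this overfits badly on short samples, so the number it prints says little about English as a whole.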