dev_throwaway | 11 months ago
The anthropomorphization of LLMs bothers me. We don't need to pretend they are alive and thinking; at best that is marketing, and at worst, by training the models to output human-sounding conversations, we are actively taking away the true potential these models could achieve if we were OK with them being "simply a tool".
But pretending that they are intelligent is what brings in the investors, so that is what we are doing. This paper is just furthering that agenda.
Philpax | 11 months ago
This is not true. The key-values of previous tokens encode computation that can be accessed by attention, as mentioned by colah3 here: https://news.ycombinator.com/item?id=43499819
You may find https://transformer-circuits.pub/2021/framework/index.html useful.
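For a concrete picture of what "accessed by attention" means, here is a minimal sketch (my own illustration in numpy, not code from the paper or the comment): in causal self-attention, the query at each position reads the keys and values of all earlier tokens, so previous tokens' KV pairs carry computation forward for later positions to reuse.

    # Minimal causal self-attention sketch in numpy.
    # Each position's query attends over the keys/values of all
    # earlier positions; the mask forbids looking at the future.
    import numpy as np

    def causal_attention(x, W_q, W_k, W_v):
        # x: (seq_len, d_model) token representations
        q = x @ W_q  # queries: what each position is looking for
        k = x @ W_k  # keys: what each earlier position advertises
        v = x @ W_v  # values: the content each position makes available
        d_k = k.shape[-1]
        scores = q @ k.T / np.sqrt(d_k)            # (seq_len, seq_len)
        future = np.triu(np.ones_like(scores), k=1)  # strictly upper triangle
        scores = np.where(future == 1, -np.inf, scores)
        # numerically stable softmax over each row
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        # each position mixes in values from itself and earlier positions only
        return weights @ v

    rng = np.random.default_rng(0)
    seq_len, d_model, d_head = 5, 8, 4
    x = rng.normal(size=(seq_len, d_model))
    W_q, W_k, W_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
    out = causal_attention(x, W_q, W_k, W_v)
    print(out.shape)  # (5, 4): position t's output depends only on tokens <= t

This is why caching the keys and values of already-generated tokens (the "KV cache") works during inference: that stored computation is exactly what later queries read.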
dev_throwaway | 11 months ago
The whitepaper you linked is a great one; I was all over it a few years back when we built our first models. It should be recommended reading for anyone interested in CS.
kazinator | 11 months ago
Anthropomorphic language has been woven into AI since its early beginnings.
AI programs were said to have goals, and to plan and hypothesize.
They were given names like "Conniver".
The word "expert system" anthropomorphizes! It's literally saying that some piece of logic programming loaded with a base of rules and facts about medical diagnosis is a medical expert.