We're not developing new algorithms all that quickly. My point is that one shouldn't dismiss criticism out of hand just because some critics of some other thing turned out to be wrong: for this point to be valid, I don't need to be making the criticism myself. On an unrelated note…
Personally, I'd be referring to the family of algorithms that purely take as input a context window and provide as output a prediction of the next token likelihood. (Plus or minus iteration, to generate strings of text.) Pejoratively, one might call these "fancy Markov chains", though as with most pejoratives, that's overly reductive.
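That "plus or minus iteration" can be made concrete with a tiny sketch. Everything below is hypothetical (the toy `next_token_probs` is a stand-in for any trained model, from n-gram counts to a transformer); the point is only that the generator's entire state is the trailing context window, which is what makes the "high-order Markov chain" description technically apply:

```python
import random

# Hypothetical stand-in for a trained model: maps a fixed-size context
# window (a tuple of tokens) to a distribution over next tokens.
def next_token_probs(context):
    vocab = ["a", "b", "c"]
    # Toy distribution keyed off the last token, for illustration only.
    idx = hash(context[-1]) % len(vocab)
    probs = [0.1] * len(vocab)
    probs[idx] = 0.8
    return vocab, probs

def generate(context, window=4, steps=8):
    """Iterate next-token prediction to produce a string of tokens.

    The only state carried forward is the trailing window of tokens,
    which is why the whole family is, formally, a Markov chain whose
    state space is 'all possible context windows'.
    """
    out = list(context)
    for _ in range(steps):
        vocab, probs = next_token_probs(tuple(out[-window:]))
        out.append(random.choices(vocab, weights=probs)[0])
    return out

print(generate(["a", "b"]))
```

The astronomically large state space is exactly why "Markov chain" is technically true but rhetorically misleading.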
All the approaches we're seeing marketed heavily are just fancy Markov chains. I expect every "new algorithm" for the next 5 years at least to be a fancy Markov chain, because that's what I expect to get funding. (I do expect that some people will be working on other approaches, but only for amateurish reasons.)
These are "fancy Markov chains" in the sense that humans are just chemicals and computers just do math. Technically true, but not merely "overly reductive": it's outright wrong if used to imply that, e.g., humans just swirl around in beakers, or that the most complex thing you can do with a computer is trigonometry.
You can make anything sound unimpressive if you describe it sufficiently poorly.
And: so many different variations are published every month. A good number of people in serious research are trying approaches that don't use cross-entropy loss (i.e., strict next-token prediction).
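For reference, "cross-entropy loss" here just means penalizing the model by the negative log-probability it assigned to the token that actually came next. A minimal sketch (the probability vectors are made up for illustration):

```python
import math

def cross_entropy_next_token(predicted_probs, target_index):
    """Negative log-likelihood of the token that actually came next,
    under the model's predicted distribution over the vocabulary."""
    return -math.log(predicted_probs[target_index])

# 70% mass on the correct next token: a small penalty.
print(cross_entropy_next_token([0.1, 0.7, 0.2], 1))   # ≈ 0.357
# A confident miss (5% on the true token) costs much more.
print(cross_entropy_next_token([0.05, 0.05, 0.9], 0)) # ≈ 2.996
```

Approaches that depart from this objective are training on something other than pure next-token prediction, which is why the "just next-token predictors" framing doesn't cover all current research.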
I don't know what the trajectory of the technology is over the next ten years, but I am positive no one else does either and anyone who thinks they do is wrong.
Strictly applying the definition, the entire universe is a Markov chain (thanks to quantum discretization!). People who use "Markov chain" as a pejorative are just idiots.
wizzwizz4|2 years ago
SEGyges|2 years ago
oasisaimlessly|2 years ago