zzbzq|1 year ago
It's the other way around. The model is impeccable at "understanding text." It's a gigantic mathematical spreadsheet that quantifies meaning. The model probably "understands" better than any human ever could. Running that backwards to produce new text is where it gets hand-wavy, and it becomes unclear whether the generative algorithms are really progressing on the same track that humans are on, or on some parallel track that diverges or even terminates early.
nottorp|1 year ago
ben_w|1 year ago
The precise mechanism LLMs use for reaching their probability distributions is why they are able to pass most undergraduate level exams, whereas the Markov chain projects I made 15-20 years ago were not.
Even as an intermediary, word2vec had to build a space in which the concept of "gender" exists such that "man" -> "woman" ~= "king" -> "queen".
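That analogy property can be sketched with toy vectors. This is purely illustrative: the 2-D embeddings below are hypothetical hand-picked values (real word2vec spaces have hundreds of learned dimensions), chosen so that one axis loosely encodes "gender" and the other "royalty".

```python
import math

# Hypothetical 2-D embeddings for illustration only.
# Axis 0 ~ "gender", axis 1 ~ "royalty".
vecs = {
    "man":   (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "king":  (1.0, 1.0),
    "queen": (-1.0, 1.0),
}

def analogy(a, b, c):
    """Word closest to vec(b) - vec(a) + vec(c), i.e. 'a is to b as c is to ?'."""
    target = (vecs[b][0] - vecs[a][0] + vecs[c][0],
              vecs[b][1] - vecs[a][1] + vecs[c][1])
    return min(vecs, key=lambda w: math.dist(vecs[w], target))

print(analogy("man", "woman", "king"))  # -> queen
```

The point is that the space itself carries the concept: "king" minus "man" plus "woman" lands on "queen" because gender is a consistent direction in the space, not because that four-word pattern appeared in training text.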
gs17|1 year ago
A Markov chain (only using the probabilities of word orders from sentences in its training set) could never output a command that wasn't stitched together from existing ones (i.e. it would always output a valid command name, but if no one had requested a reminder for a date in 2026 before it was trained, it would never output that year). No amount of documents saying "2026 is the year after 2025" would make a Markov chain understand that fact, but LLMs are able to "understand" that.
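The limitation described above is easy to demonstrate with a minimal first-order word Markov chain. The tiny corpus below is made up for illustration; the generator can only emit a word that was actually observed following its predecessor in training, so a year that never appeared (like "2026") can never be produced.

```python
import random
from collections import defaultdict

# Toy training corpus (hypothetical reminder commands).
corpus = [
    "remind me on june 1 2024",
    "remind me on july 4 2025",
]

# Record which words were observed following each word.
transitions = defaultdict(list)
for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        transitions[a].append(b)

def generate(start, max_len=6, seed=0):
    """Walk the chain from `start`, sampling only observed continuations."""
    random.seed(seed)
    out = [start]
    while len(out) < max_len and transitions[out[-1]]:
        out.append(random.choice(transitions[out[-1]]))
    return " ".join(out)

# Every output token follows a transition seen in training, so "2026"
# is unreachable no matter how the chain is sampled.
print(generate("remind"))
```

The chain stitches together observed bigrams and nothing else; no amount of extra documents stating "2026 is the year after 2025" changes that, because the model has no mechanism to combine facts, only to replay observed word orders.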