top | item 47162978


tsunamifury | 4 days ago

I think it’s funny that at Google I invented and productized the next-word (and next-action) predictor in Gmail and Hangouts Chat, and I’ve never had a single person come to me and ask how this all works.

To me, LLMs are incredibly simple. Next word, next sentence, next paragraph, and next answer are stacked attention layers, which identify manifolds and run in reverse to keep the attention head on track for the next token. It’s pretty straightforward math, and you can sit down and make a tiny LLM pretty easily on your home computer with a good-sized bag of words and context.
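The simplest version of "next word prediction from a bag of words" is a bigram counter: for each word, count which words follow it, then predict the most frequent one. This is a minimal sketch with a made-up toy corpus (not the actual Gmail/Hangouts system, which the commenter doesn't detail):

```python
from collections import defaultdict, Counter

# Toy corpus standing in for the "good-sized bag of words" the comment mentions.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word is followed by each next word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(word):
    """Return the most frequent next word observed after `word`, or None."""
    following = counts.get(word)
    return following.most_common(1)[0][0] if following else None

print(predict("the"))  # prints "cat" ("cat" follows "the" twice, more than any other word)
```

A real LLM replaces these raw counts with learned attention weights over a long context, but the prediction target is the same: the likeliest next token.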

To me it’s baffling that everyone constantly goes around saying not even Nobel prize winners know how this works, that it’s a huge mystery.

Has anyone thought to ask the actual people like me and others who invented this?


kosh2 | 3 days ago

This is like saying quantum mechanics is really simple to understand, all you have to do is find the right formula and plug in the numbers.

When people talk about understanding, they mean knowing how the underlying mechanism works, often by finding an analog in real life.

tsunamifury | 3 days ago

It is a sophisticated way of putting your foot in front of you and taking a step while keeping your head up and looking at your destination.

booleandilemma | 4 days ago

A lot of people in tech thrive on the mystery and don't like explaining things in simple terms. It makes what they do seem more valuable if no one can understand what they're talking about. At the same time, being vague and mysterious can help hide someone's own misunderstandings. When you speak clearly you need to be accurate, because it's more obvious when you're wrong.

tsunamifury | 3 days ago

I agree -- or the math is just way over people's heads -- even though at its core it's just word pointing to word, N times.
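The "word points to word N times" math this thread keeps gesturing at is, in modern LLMs, scaled dot-product attention: each word scores its similarity to every other word, the scores are softmaxed into weights, and the output is a weighted mix. A minimal sketch with made-up 2-d toy vectors (real models use hundreds of dimensions and learned projections):

```python
import math

def attention(query, keys, values):
    """One step of scaled dot-product attention for a single query vector."""
    d = len(query)
    # Similarity of the query word to every key word (dot products, scaled).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Softmax turns scores into positive weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Output is the attention-weighted mix of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

# The query aligns with the first key, so the first value dominates the mix.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
```

Stacking many such layers, with learned query/key/value projections, is the part that is mechanically simple yet hard to interpret at scale.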