item 40648697

Show HN: A cartoon intro to how the attention mechanism works

63 points | ykhli | 1 year ago | ai-explained.yoko.dev

I started to draw cartoons to make complex concepts in AI more accessible to the rest of us. Hope this is useful & lmk if you have ideas or feedback!

12 comments

[+] itronitron | 1 year ago
While I like the drawings, their combination with purposefully vague and mysterious/mystical terminology like 'embeddings' and 'attention' makes the entire document feel like one of those pamphlets that cults or 'science-based' religions produce.
[+] refulgentis | 1 year ago
Specific feedback:

- The model is said to be rewarded/penalized based on answers, and the plain meaning would leave someone with the impression of "answer to the question asked/prompt". The visual appears correct: the chalkboard indicates the question is "fill in the next token" and the answer is "the next most likely token".

- You end up with a strong sense that there are three tables in the model, called key, query, and value, that carry the weight of the world. Are there only three?

- "what part of data are we processing and are they relevant?" - what is they? what part of data is it processing? what is data in this context?

- "how well does this data answer my question?" - so the model is picking out answers from the training data and checking whether they answer the prompt? This creates a strong sense of copying verbatim from training data.

- "how should we improve the contextual representation of the input data?" - "contextual representation of the input data" isn't clear here, in my dummy brain, it's "here's information we can use to decide the next token: in this context, they meant cat cli, not cat the animal"

- "training a model is like doing Q&A with the neural networks while it attends to the right data": Is it Q&A? If we double down on that (which also means doubling down on the idea of finding the answer in the training data, which I don't think is a good idea), then there's a gap in how the sentence connects within itself that's worth addressing. I assume it means "like doing Q&A with the neural networks, and based on whether the answer is correct or not, it'll adjust and learn to attend to the right data".
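[Editor's note: for readers puzzling over the "three tables" and "relevance" questions in the thread above, the standard mechanism behind the cartoon is scaled dot-product attention, where key, query, and value are three learned projection matrices. A minimal sketch follows; the names `W_q`, `W_k`, `W_v` and all shapes are illustrative assumptions, not taken from the cartoon.]

```python
# Minimal single-head scaled dot-product attention sketch (NumPy).
# Assumes the "three tables" are learned projections W_q, W_k, W_v.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, W_q, W_k, W_v):
    """Single-head attention over token embeddings X (seq_len x d_model)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v        # project the same input three ways
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # "how relevant is token j to token i?"
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ V, weights                # contextual representation of the input

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                    # 4 tokens, embedding dim 8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out, weights = attention(X, W_q, W_k, W_v)
print(out.shape, weights.shape)                # (4, 8) (4, 4)
```

Note the answer to the "only 3?" question: a real transformer repeats this per attention head and per layer, so there are many such triples, not one.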

[+] ykhli | 1 year ago
hey! thanks so much for the feedback. I'd actually love to keep updating / iterating these cartoons so they are more approachable. If you have time, I'd love to hear more on which pages are confusing & how I could have explained it better!

I _tried_ to give a definition to embeddings on page 11, but maybe that's not the most intuitive? Lmk! feel free to DM

[+] gumby | 1 year ago
> purposefully vague and mysterious/mystical terminology like 'embeddings' and 'attention'

I agree, but this isn't the cartoonist's problem: that's the fanciful vocabulary of the paper's authors.

[+] dennisbabych | 1 year ago
Cool idea!

But you should decrease the amount of text, because it's hard to watch and hard to read. The fonts should also be changed; look at some good design examples.

Before you present info via infographics or animations or anything else, you should study a lot of good examples and then pick one to follow.