rryan's comments

rryan | 9 months ago | on: The bitter lesson is coming for tokenization

Don't make me tap the sign: There is no such thing as "bytes". There are only encodings. UTF-8 is the encoding most people are using when they talk about modeling "raw bytes" of text. UTF-8 is just a shitty (biased) human-designed tokenizer of the Unicode code points.
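To make the "biased tokenizer" point concrete, here is a minimal sketch that counts how many UTF-8 bytes one code point costs in a few scripts. The sample characters and the helper name `utf8_byte_counts` are my own choices for illustration, not anything from the linked article.

```python
# Illustrative sketch: UTF-8 spends a different number of bytes per code point
# depending on the script, so a "byte-level" model sees longer sequences for
# some languages than for others.

def utf8_byte_counts(text):
    """Return the UTF-8 byte length of each code point in `text`."""
    return [len(ch.encode("utf-8")) for ch in text]

# Hypothetical sample characters, one per script.
samples = {
    "Latin": "a",
    "Greek": "α",
    "Devanagari": "अ",
    "CJK": "中",
    "Emoji": "😀",
}

for script, ch in samples.items():
    print(f"{script:<11} U+{ord(ch):04X} -> {utf8_byte_counts(ch)[0]} byte(s)")
```

These five characters take 1, 2, 3, 3 and 4 bytes respectively, so the same number of code points turns into very different sequence lengths depending on the language.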

rryan | 1 year ago | on: Transformers Without Normalization

RMSNorm is pretty insignificant in terms of the overall compute in a transformer though -- usually the reduction work can be fused with earlier or later operations.
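For reference, a minimal pure-Python sketch of RMSNorm over a single vector, assuming the usual formulation (divide by the root mean square, then apply a learned per-channel gain). The names `rms_norm`, `gain` and `eps` are my own, and real implementations run this as a fused kernel rather than a Python loop.

```python
import math

def rms_norm(x, gain, eps=1e-6):
    """Scale `x` by the inverse of its root mean square, then by `gain`."""
    # The only reduction: one pass to get the mean of squares.
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # Everything else is elementwise, which is what makes it easy to fuse.
    return [g * v * inv_rms for v, g in zip(x, gain)]

out = rms_norm([3.0, 4.0], gain=[1.0, 1.0])
print(out)
```

The work is one reduction plus an elementwise multiply per element, which is linear in the hidden size, while the matmuls around it scale quadratically.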

rryan | 1 year ago | on: Can Gemini 1.5 read all the Harry Potter books at once?

ML 101: Do not evaluate on the training data.

Yes of course it can, because they fit in the context window. But this is an awful test of the model's capabilities because it was certainly trained on these books and websites talking about the books and the HP universe.

rryan | 2 years ago | on: (next Rich)

Thanks for everything, Rich. You inspired me repeatedly.

rryan | 2 years ago | on: AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

This is ... not what I expected. It's basically wiring up pre-trained models to ChatGPT via a router and "modality transformations" (a.k.a. speech-to-text and text-to-speech).

I expected it to be a GPT-style model that processes audio directly to perform a ton of speech and maybe speech-text tasks in a zero-shot manner.

rryan | 3 years ago | on: Neurons in a dish learn to play Pong

I agree that a Transformer is an example of a "reflexive" behavior because it learns to react in a context (via gradient descent rather than evolution as the learning algorithm). It's a conditional categorical distribution on steroids.

I also agree it's not much different than what's going on in this petri dish with pong.

But I don't think that's a profound statement.

What I'm saying is that calling what a Transformer does "language development" isn't accurate. A Transformer can't "develop" language in that sense; it can only learn "reflexive" behavior from the data distribution it's trained on (it could never have produced that data distribution itself without the data existing in the first place).

rryan | 3 years ago | on: Neurons in a dish learn to play Pong

There's a huge difference between fitting a probabilistic model to a data distribution then sampling from it (what GPT-3 is) and agents that invent language and use it to communicate.
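A toy illustration of "fit a probabilistic model to data, then sample from it": a bigram model, which is a far cruder stand-in for GPT-3 but has the same shape (a conditional categorical distribution over the next token). The corpus and the names `fit_bigram` and `sample` are made up for this sketch.

```python
import random
from collections import Counter, defaultdict

def fit_bigram(tokens):
    """Count next-token frequencies for each token (the 'fitting' step)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def sample(counts, start, length, rng):
    """Draw tokens from the fitted conditional categorical distributions."""
    out = [start]
    for _ in range(length - 1):
        options = counts.get(out[-1])
        if not options:  # token never seen with a successor
            break
        words, weights = zip(*options.items())
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out

# Hypothetical toy corpus.
corpus = "the ball hits the paddle and the paddle hits the ball".split()
model = fit_bigram(corpus)
generated = sample(model, "the", 8, random.Random(0))
print(" ".join(generated))
```

Every transition this sampler emits was already present in the training text. It reacts to context, but it cannot come up with a word or a pairing the data did not contain.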

rryan | 3 years ago | on: Goodbye, Feedly

"I could go pro, but nah" -- I read this and closed the tab. What a whingefest from a free tier user.

The Reddit crawling problem is because Reddit rate limits their crawling so they have to prioritize the most popular feeds. What's the problem with linking your account, or making a dedicated feedly throwaway for crawling?

Been a pro user since the beginning because I want the service to stick around. It works just as well as it always has and I don't mind that they're adding new features even if they aren't for me.

Sheesh.

rryan | 4 years ago | on: Vanced Discontinuation

> The main source of income for youtube isn't ads. YouTube revolves around the merchandise and YouTube Premium subscriptions.

> If you are talking about creators who are not earning money for using vanced, you should know they won't make millions out of those ads.

... right. Feel free to keep telling yourselves that.

rryan | 4 years ago | on: Vanced: YouTube adblocker for Android

Straw man argument. OP was suggesting you pay for YT premium, which does away with the ads completely. Removing your web adblocker is a non sequitur, and exposes your computer to malware / spyware in addition to ads.

rryan | 4 years ago | on: Google increases parental-leave policy to nearly 6 months

Parents are treated equally. If you give birth you get an additional 6 weeks of medical recovery leave. If two parents (any gender) adopt a child then both receive an equal length baby bonding leave.

I've taken parental leave twice at Google and used the full amount. No ill effects, as far as I can tell. Everyone I work with and my management chain were fully supportive.

rryan | 4 years ago | on: WebAssembly: The New Kubernetes?

nit: Kubernetes is the container orchestration, not the container technology itself. So the comparison is apples vs. oranges.

You could replace the containers that are being scheduled by Kubernetes with WebAssembly. Others already linked to Krustlet which is effectively this.
