top | item 45788468


gchadwick | 3 months ago

Karpathy's contribution to teaching deep learning is just immense. He's got a mountain of fantastic material, from short articles like this, to longer writing like https://karpathy.github.io/2015/05/21/rnn-effectiveness/ (on recurrent neural networks), to all of the stuff on YouTube.

Plus his GitHub. The recently released nanochat https://github.com/karpathy/nanochat is fantastic. Having minimal, understandable and complete examples like that is invaluable for anyone who really wants to understand this stuff.


kubb|3 months ago

I was slightly surprised that my colleagues, who are extremely invested in capabilities of LLMs, didn’t show any interest in Karpathy’s communication on the subject when I recommended it to them.

Later I understood that they don’t need to understand LLMs, and they don’t care how they work. Rather they need to believe and buy into them.

They’re more interested in science fiction discussions (how would we organize a society where all work is done by intelligent machines?) than in what kinds of tasks LLMs are good at today and why.

Al-Khwarizmi|3 months ago

What's wrong or odd about that? You can like a technology as a user and not want to delve into how it works (sentence written by a human despite use of "delve"). Everyone should have some notions on what LLMs can or cannot do, in order to use them successfully and not be misguided by their limitations, but we don't need everyone to understand what backpropagation is, just as most of us use cars without knowing much about how an internal combustion engine works.

And the issue you mention in the last paragraph is very relevant, since the scenario is plausible, so it is something we definitely should be discussing.

CuriouslyC|3 months ago

I think there are a lot of people who just don't care about stuff like nanochat because it's exclusively pedagogical, and a lot of people want to learn by building something cool, not taking a ride on a kiddie bike with training wheels.

tanepiper|3 months ago

If everyone had to understand how carburettors, engines and brake systems work to be able to drive a car, rather than just learn to drive and get from A to B, I'm guessing there would be a lot fewer cars on the road.

(Thinking about it, would that necessarily be a bad thing...)

danielbln|3 months ago

I'm personally very interested in how LLMs work under the hood, but I don't think everyone who uses them as tools needs that. I don't know the wiring inside my drill, but I know how to put a hole in my wall and not my hand regardless.

miki123211|3 months ago

Not everybody who drives a car (even as a professional driver) knows how to make one.

If you live in a world of horse carriages, you can be thinking about what the world of cars is going to be like, even if you don't fully understand what fuel mix is the most efficient or what material one should use for a piston in a four-stroke.

android521|3 months ago

Do you go deep into molecular biology to see how it works? It is much more interesting and important.

amelius|3 months ago

But the question is whether you or they have a better understanding of LLMs from a user's perspective.

arisAlexis|3 months ago

Obviously they are more focused on making something that works

teiferer|3 months ago

Which is terrible. That's the root of all the BS around LLMs. People lacking understanding of what they are and ascribing capabilities which LLMs just don't have, by design. Even HN discussions are full of that. Even though this page literally has "hacker" in its name.

throwaway290|3 months ago

And to all the LLM heads here, this is his work process:

> Yesterday I was browsing for a Deep Q Learning implementation in TensorFlow (to see how others deal with computing the numpy equivalent of Q[:, a], where a is an integer vector — turns out this trivial operation is not supported in TF). Anyway, I searched “dqn tensorflow”, clicked the first link, and found the core code. Here is an excerpt:

Notice how it's "browse" and "search", not just "I asked chatgpt". Notice how it made him notice a bug.
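(For readers unfamiliar with the indexing operation mentioned in the quote, here is a minimal NumPy sketch. It assumes the intent is to pick, for each row of a Q-value matrix, the entry at that row's action index, which is the usual need in a DQN loss; the quote itself does not spell this out. The function names and the toy data are hypothetical. The second function shows a one-hot masking workaround of the kind often used when a framework lacks this indexing; whether the linked code did exactly this is an assumption.)

```python
import numpy as np

def select_action_values(q, actions):
    """Return q[i, actions[i]] for every row i (one value per row)."""
    q = np.asarray(q)
    actions = np.asarray(actions)
    rows = np.arange(q.shape[0])
    return q[rows, actions]

def select_action_values_one_hot(q, actions):
    """Same result via a one-hot mask and a row-wise sum."""
    q = np.asarray(q)
    actions = np.asarray(actions)
    # Row k of the identity matrix is the one-hot vector for action k.
    one_hot = np.eye(q.shape[1], dtype=q.dtype)[actions]
    return (q * one_hot).sum(axis=1)

# Toy data (hypothetical): 2 states, 3 actions.
q = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
a = np.array([2, 0])

picked = select_action_values(q, a)
picked_one_hot = select_action_values_one_hot(q, a)

# Note: the literal NumPy expression q[:, a] selects whole columns instead,
# producing a (rows, len(a)) array rather than one value per row.
columns = q[:, a]
```

(The two functions agree on this toy input; the one-hot version trades a little extra arithmetic for using only multiply and sum, operations that essentially every array framework supports.)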

stingraycharles|3 months ago

First of all, this is not a competition between “are LLMs better than search”.

Secondly, the article is from 2016; ChatGPT didn’t exist back then.

confirmmesenpai|3 months ago

What you did here is called confirmation bias.

> I think congrats again to OpenAI for cooking with GPT-5 Pro. This is the third time I've struggled on something complex/gnarly for an hour on and off with CC, then 5 Pro goes off for 10 minutes and comes back with code that works out of the box. I had CC read the 5 Pro version and it wrote up 2 paragraphs admiring it (very wholesome). If you're not giving it your hardest problems you're probably missing out.

https://x.com/karpathy/status/1964020416139448359