lsy|7 months ago
Despite the feeling that it's a fast-moving field, most of the differences in actual models over the last few years are in degree and not kind, and the majority of ongoing work is in tooling and integrations, which you can probably keep up with as it seems useful for your work. Remembering that it's a model of text and is ungrounded goes a long way toward discerning what kinds of work it's useful for (where verification of output is either straightforward or unnecessary), and what kinds of work it's not useful for.
crystal_revenge|7 months ago
I also have not experienced the post's claim that "Generative AI has been the fastest moving technology I have seen in my lifetime." I can't speak for the author, but I've been in this field from when "SVMs are the new hotness and neural networks are a joke!" through the entire explosion of deep learning and the insane number of DL frameworks in the 20-teens, all within a decade (remember implementing restricted Boltzmann machines and pre-training?). Similarly, I saw "don't use JS for anything other than enhancing the UX" give way to single-page webapps being the standard in the same timeframe.
Unless someone's aim is to be on that list of "high signal" people, it's far better to just keep your head down until you actually need these solutions. As an example, I left webdev work around the time of backbone.js, one of the first attempts at front-end MVC for single-page apps. Then the great React/Angular wars began, and I just ignored it. A decade later I was working with a webdev team and learned React in a few days, very glad I did not stress about "keeping up" during that period of non-stop change. Another example: just 5 years ago everyone was trying to learn how to implement LSTMs from scratch... only to have that model essentially become obsolete with the rise of transformers.
Multiple times over my career I've learned the lesson that "moving fast" is another way of saying "immature." One would find more success learning about the GLM (or, god forbid, learning to identify survival analysis problems) and all of its still-underappreciated uses for day-to-day problem solving (old does not imply obsolete) than learning the "prompt hack of the week".
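To make the GLM point concrete: logistic regression is just a GLM with a logit link, and you can fit a toy one in plain Python. Everything below (the data, learning rate, and epoch count) is invented for illustration; a real analysis would use a proper stats library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit y ~ Bernoulli(sigmoid(w*x + b)) by stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # gradient of the negative log-likelihood for one sample
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# toy data: larger x -> more likely y = 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(sigmoid(w * 0.5 + b) < 0.5)  # low x -> predicts class 0
print(sigmoid(w * 4.5 + b) > 0.5)  # high x -> predicts class 1
```

Swapping the link function and the error distribution gives you Poisson regression, gamma regression, and so on — that's the whole GLM family, and it hasn't changed in fifty years.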
megh-khaire|7 months ago
However, this AI wave does feel a bit different. What stands out is the speed of progress in multiple directions. We’ve seen new model architectures, prompting techniques, and agent frameworks. And every time one of those advances, it opens up new possibilities that startups are quick to explore.
I’m with you that chasing every shiny thing isn’t practical or even useful most of the time. But as someone curious about the space, I still find it exciting.
thorum|7 months ago
- Someone made a slightly different tool for using LLMs (may or may not be useful depending on whether existing tools meet your needs)
- Someone made a model that is incrementally better at something, beating the previous state-of-the-art by a few % points on one benchmark or another (interesting to keep an eye on, but remember that this happens all the time and this new model will be outdated in a few months - probably no one will care about Kimi-K2 or GPT 4.1 by next January)
I think most people can comfortably ignore that kind of news and it wouldn’t matter.
On the other hand, some LLM news is:
- Someone figured out how to give a model entirely new capabilities.
Examples: RL and chain of thought. Coding agents that actually sort of work now. Computer use. True end-to-end multimodal models. Intelligent tool use.
Most people probably should be paying attention to those developments (and trying to look forward to what’s coming next). But the big capability leaps are rare and exciting enough that a cursory skim of HN posts with >500 points should keep you up-to-date.
I’d argue that, as with other tech skills, the best way to develop your understanding of LLMs and their capabilities is not through blogs or videos etc. It’s to build something. Experience for yourself what the tools are capable of, what does and doesn’t work, what is directly useful to your own work, etc.
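In that build-something spirit, the "tool use" capability mentioned above is less mysterious than it sounds: at its core it's a loop that parses the model's output, runs a tool, and feeds the result back in. The sketch below fakes the model with a stub so it's self-contained; the JSON format, tool names, and stub behavior are all invented for illustration.

```python
import json

# Hypothetical tools the "model" may call; in a real agent these would be
# real functions (search, file I/O, a calculator, ...).
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def fake_model(prompt):
    """Stand-in for an LLM: emits either a tool call or a final answer.
    A real model would be prompted to produce this JSON format."""
    if "TOOL_RESULT" not in prompt:
        return json.dumps({"tool": "add", "args": [2, 3]})
    return json.dumps({"answer": "2 + 3 = 5"})

def agent_loop(user_msg, max_steps=5):
    prompt = user_msg
    for _ in range(max_steps):
        msg = json.loads(fake_model(prompt))
        if "answer" in msg:                        # model is done
            return msg["answer"]
        result = TOOLS[msg["tool"]](*msg["args"])  # run the requested tool
        prompt += f"\nTOOL_RESULT: {result}"       # feed the result back
    raise RuntimeError("agent did not finish")

print(agent_loop("What is 2 + 3?"))  # → 2 + 3 = 5
```

Writing even this toy version makes it obvious where real agents fail: malformed tool calls, loops that never terminate, and tools with side effects.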
PaulHoule|7 months ago
A lot of people are feeling HN is saturated with AI posts whether it is how MCP is like USB-C (repeated so much you know it is NPCs) or how outraged people are that their sh1t fanfics are being hoovered up to train AI.
This piece is not “news”; it’s a summary, and a tepid one at best. I wish people had better judgment about what they vote up.
pyman|7 months ago
1. Stop living other people's experiences. Start having your own.
2. Stop reading blogs. Start building apps.
3. Everyone's experience depends on their use case or limitations. Don't follow someone's opinion or ideology without understanding why.
4. Don't waste time chasing employees or researchers on Twitter or Substack. Most of them are just promoting themselves or their company.
5. Don't let anxiety or FOMO take over your time. Focus on learning by doing. If something important comes out, you'll find out eventually.
6. Being informed matters, but being obsessed with information doesn't. Be smart about how you manage your time.
That's what I tell them.
godelski|7 months ago
I think it is easy for it to feel like the field is moving fast while it actually isn't. I learned this lesson when I lost basically a year to taking care of my partner. I thought I'd be way behind when coming back, but really not much had changed.
I think gaining this perspective can help you "keep up". Even if you are having a hard time now, that might just mean you don't have enough depth yet. Which is perfectly okay! It might encourage you to focus on different things so that you can keep up. You can't stay one step behind if you don't first know how to run. Or insert some other inspirational analogy here. The rush is in your head, not in reality.
alphazard|7 months ago
The minutiae of how next token prediction works is rarely appreciated by lay people. They don't care about dot products, or embeddings, or any of it. There's basically no advantage to explaining how that part works since most people won't understand, retain, or appreciate it.
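For anyone who does want the minutiae: a next-token step really is just dot products over embeddings followed by a softmax. The vocabulary, the 2-d embedding vectors, and the "context vector" below are all made up to keep the sketch tiny.

```python
import math

# Toy vocabulary with hypothetical 2-d embeddings (all numbers invented).
embeddings = {
    "cat":  [0.9, 0.1],
    "dog":  [0.8, 0.2],
    "meow": [0.9, 0.0],
    "bark": [0.7, 0.3],
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def next_token_probs(context_vec):
    """Score every token by dot product with the context, then softmax."""
    scores = {tok: dot(context_vec, emb) for tok, emb in embeddings.items()}
    m = max(scores.values())  # subtract max for numerical stability
    exps = {tok: math.exp(s - m) for tok, s in scores.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

# Pretend the model's hidden state after reading "the cat says" is this vector:
probs = next_token_probs([1.0, -0.5])
print(max(probs, key=probs.get))  # → meow
```

A real model computes the context vector with dozens of transformer layers, but the final step — dot the hidden state against every token embedding and normalize — looks exactly like this.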
Melonololoti|7 months ago
And then you have GenAI image models like Flux and all the open source projects.
I think it's beneficial to get a grasp of all of that and then keep an eye on it, so you catch the moment when it becomes relevant for you rather than being surprised and too late.
gammalost|7 months ago
Why? When you think you might need something, just search for it. There are too many models with only incremental improvements.
nerdsniper|7 months ago
I maintain a funnel sucking up all the PR stuff — but I skip straight to the papers, benchmarks, and GitHub repos.
qsort|7 months ago
Last week I showed some colleagues how to do some basic things with Claude Code and they were like "wow, I didn't even know this existed". Bro, what are you even doing.
There is definitely a lot of hype and the lunatics on Linkedin are having a blast, but to put it mildly I don't think it's a bad investment to experiment a bit with what's possible with the SOTA.
crystal_revenge|7 months ago
The trouble is that the advice in the post will have very little impact on "understanding how LLMs work". The number of people who talk about LLMs daily but have never run an LLM locally, and certainly never "opened it up to mess around", is very large.
A fun weekend exercise that anyone can do is to implement speculative decoding[0] using local LLMs. You'll learn a lot more about how LLMs work than reading every blog/twitter stream mentioned there.
0. https://research.google/blog/looking-back-at-speculative-dec...
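For a sense of what that exercise looks like, here is a deliberately tiny sketch of the idea: a cheap draft model proposes several tokens and the expensive target model verifies them. Both "models" below are just next-char lookup tables invented for illustration, and where the real method verifies stochastically via acceptance sampling, this sketch uses the simpler greedy-match rule.

```python
DRAFT  = {"h": "e", "e": "l", "l": "l"}  # fast but wrong after "l"
TARGET = {"h": "e", "e": "l", "l": "o"}  # the model whose output we want

def greedy(model, last, k):
    """Generate up to k next chars greedily, starting from `last`."""
    out = []
    for _ in range(k):
        nxt = model.get(last)
        if nxt is None:
            break
        out.append(nxt)
        last = nxt
    return out

def speculative_step(prefix, k=4):
    """One decode step: the draft proposes k chars, the target verifies."""
    proposed = greedy(DRAFT, prefix[-1], k)
    accepted = []
    last = prefix[-1]
    for d in proposed:
        t = TARGET.get(last)      # target's own next choice
        if t != d:                # first mismatch: keep target's char, stop
            if t is not None:
                accepted.append(t)
            break
        accepted.append(d)        # verified: accept the draft char for free
        last = d
    return prefix + "".join(accepted)

print(speculative_step("h"))  # → helo  (identical to the target's own output)
```

The payoff in the real algorithm is that the target verifies all k draft tokens in a single parallel forward pass instead of k sequential ones — implementing it with two actual local models makes that trade-off very tangible.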
layer8|7 months ago
That’s a nice way to put it, made me chuckle. :)
chamomeal|7 months ago
It is ridiculously cool, but I think any developer who is out of the loop could easily get back into the loop at any moment, without having to stay caught up most of the time.