mattlangston | 18 days ago
One data point for this thread: the jump from Opus 4.5 to 4.6 is not linear. The minor version number is misleading. In my daily work the capability difference is the largest single-model jump I've experienced, and I don't say that casually — I spent my career making precision measurements.
I keep telling myself I should systematically evaluate GPT-5.3 Codex and the other frontier models. But Opus is so productive now that I can't justify the time. That velocity of entrenchment is itself a signal, and I think it quietly supports the author's thesis.
I'm not a doomer — I'm an optimist about what prepared individuals and communities can do with this. But I shared this article with family and walked them through it in detail before I ever saw it on HN. That should tell you something about where I think we are.
diminish | 17 days ago
the real challenge will be at the frontier of human knowledge: whether llms can actually push things forward or not.
ps1: i'm using 5.3/o4.6/k2.5/m2.5/glm5 and others daily for development, so my work has intensified about 1.5x - i tackle increasingly harder problems, but llms still fail big on brand new challenges, just like i do. so i'm more alert than ever.
ps2: syntactical autocomplete used to write 80% of my code; now llms have replaced autocomplete, but at a semantic level: i think, and the llm implements most of my actions, like a cerebellum coordinating muscles - while sometimes teaching me new info from the net.
mattlangston | 17 days ago
That's where the 4.5 -> 4.6 jump hit me hardest - not routine tasks, but problems where I need the model to reason about things it hasn't seen. It still fails, but it went from confidently wrong to productively wrong, if that makes sense. I can actually steer it now.
The cerebellum analogy resonates. I'd go further - it's becoming something I think out loud with, which is changing how I approach problems, not just how fast I solve them.
jakobnissen | 17 days ago
nicwolff | 16 days ago