Sure, but the low-hanging fruit is mostly squeezed, so what else is driving the idea of _job replacement_ if the next branch up the tree is 3-5 years out? Beyond tooling empowering existing employees, I've seen very little to indicate a major jump in productivity, and nothing close to job replacement (for technical roles). Oftentimes it's still accruing various forms of technical debt, other debts, or complexity. Unless these cuts are to the 1% of nontechnical roles, it doesn't make much sense for the broader economy, other than as their own internal projection for this year. Maybe it's because they have such a large ship to turn that they need to plan 2-3 years out? I don't get it; I still see people hiring technical writers on a daily basis, even. So what's getting cut there?
What exactly would that evidence look like, for you?
It definitely increases some types of productivity (Opus one-shot a visualization that would have likely taken me at least a day to write before, for work) - although I would have never written this visualization before LLMs (because the effort was not worth it). So I guess it's Jevons Paradox in action somewhat.
To observe the productivity increases, you need a scale at which the productivity would really matter (the same way that once a benchmark like AIME is saturated, it stops telling us anything useful about model improvement).
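To make the saturation point concrete, here's a toy simulation (entirely made-up numbers, not a claim about AIME itself): two models with genuinely different ability look identical on an easy benchmark and only separate on a harder one.

```python
import random

def score(true_skill, difficulty, n_items=500, seed=0):
    # Toy model: an item is solved when skill plus noise beats its difficulty.
    rng = random.Random(seed)
    solved = sum(true_skill + rng.gauss(0, 0.1) > difficulty
                 for _ in range(n_items))
    return solved / n_items

weak, strong = 0.7, 0.9   # hypothetical "true" abilities

# Saturated (easy) benchmark: both models pin the ceiling, no signal.
print(score(weak, difficulty=0.3), score(strong, difficulty=0.3))  # ~1.00 vs ~1.00

# Harder benchmark: the same gap in ability becomes visible.
print(score(weak, difficulty=0.8), score(strong, difficulty=0.8))  # ~0.16 vs ~0.84
```

Same idea for productivity: if the task is small enough that everyone finishes it easily, the measurement can't distinguish tooling that helps from tooling that doesn't.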
If that's the case, I feel like you couldn't actually be using them or paying attention. I'm a big proponent and use LLMs for code and hardware projects constantly, but Gemini Pro and ChatGPT 5.2 are both probably in the worst state we've seen. Six months ago I was worried, but at this point I've started finding other ways to get answers to things. Going back to the stone tablets of googling and reading Stack Overflow or Reddit.
I still use them, but find that more of the time is spent arguing with them and correcting their mistakes than actually getting any useful product.
> I still use them, but find that more of the time is spent arguing with them and correcting their mistakes than actually getting any useful product.
I feel the same. They're better at some things, yes, but also worse at others. And for me, they're worse at my really important use cases. I could spend a month typing prompts into Codex or AntiGravity and still be left holding the bag. Just yesterday I had a fresh prompt and Gemini bombed super hard on some basic work, insisting the problem was X when it wasn't. I don't know. I was super bullish, but now I'm feeling far from sold on it.