imatworkyo|10 months ago
There are many types of complexity, and many tasks that are complex for a human coder are trivial for an AI and its skill set.
gf000|10 months ago
A CRUD backend app for a business in a common sector? That's mostly just connecting stuff together (though I would argue that an experienced dev with a good stack can write it in less time than it takes to painstakingly explain it to an LLM in inexact human language).
Some R&D work, or debugging of any kind of code? There it's almost useless, because that requires deep reasoning, and these models absolutely break down at it.
simonw|10 months ago
I have been extremely impressed with o1, o3, o4-mini and Gemini 2.5 as debugging aids. The combination of long context input and their chain-of-thought means they can frequently help me figure out bugs that span several different layers of code.
I wrote about an early experiment with that here: https://simonwillison.net/2024/Sep/25/o1-preview-llm/
Here's a Gemini 2.5 Pro transcript from this afternoon where I'm trying to figure out a very tricky bug: https://gist.github.com/simonw/4e208ab9edb5e6a814d3d23d7570d...
tyre|10 months ago
It thinks of things that I don’t think of right away. It tries weird approaches that are frequently wrong but almost always yield some information and are sometimes spot on.
And sometimes there's an annoying problem where letting Claude bang its head against it for $1.25 in API calls is slower than doing it myself, but it frees me to spend my time and emotional bandwidth elsewhere.
expensive_news|10 months ago
But when I try to do more complicated math, it falls short. I do have to say, though, that Gemini 2.5 Pro is starting to get better in this area.