Might work for you, but if I multi task too much, the quality of my output drops significantly. Where I work, that does not fly. I cannot trust any agent to handle anything without babysitting them to avoid going off the rails - but perhaps the tools I have access to just aren't good (underlying model is claude 4.5, so it the model isn't the cause).I've said this in the past and I'll continue to say it - until the tools get far better at managing context, they will be hard locked for value in most use cases. The moment I see "summarizing conversation" I know I'm about to waste 20 minutes fixing code.
dionian|2 months ago
If I worked on different types of systems with different types of tasks I might feel the same way as you, i think AI works well in specific targeted use cases, where some amount of hallucination can be tolerated and addressed.
What models are you using, I use opus 4.5, which can one shot a surprising ratio of tasks.
fragmede|2 months ago
> so it the model isn't the cause
Thing is, the prompts, those stupid little bits of English that can't possiu matter all that much? It turns out they affect the models performance a ton.