(no title)
rapind | 24 days ago
Some of the things it occasionally does:
- Ignores conventions (even when emphasized in the CLAUDE.md)
- Decides to just not implement tests if gets spins out on them too much (it tells you, but only as it happens and that scrolls by pretty quick)
- Writes badly performing code (N+1)
- Does more than you asked (in a bad way, changing UIs or adding cruft)
- Makes generally bad assumptions
I'm not trying to be overly negative, but in my experience to date, you still need to babysit it. I'm interested though in the idea of using multiple models to have them perform independent reviews to at least flag spots that could use human intervention / review.
vidarh|23 days ago
I don't want to waste my time reviewing a change the model can still significantly improve all by itself. My time costs far more than the models.