item 46511978

jeeeb | 1 month ago

Looking through this guy's GitHub he seems to have a lot of small “demo” apps, so I'm not surprised he gets a lot of value out of LLM tools.

Modern LLMs are amazing for writing small, self-contained tools/apps and adding isolated features to larger code bases, especially when the problem can be solved by composing existing open source libraries.

Where they fall flat is their lack of long-term memory and their inability to learn from mistakes and gain new insider knowledge/experience over time.

The other area where they fall flat is that they rush to achieve their immediate goal and tick functional boxes without considering wider issues such as security, performance and maintainability. I suspect this is an artefact of the reinforcement learning process: it's relatively easy to assess whether a functional outcome has been achieved, while assessing secondary outcomes (is this code secure, bug-free, maintainable and performant?) is much harder.
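To make that last point concrete, here's a toy Python sketch (all names invented for illustration): both functions pass the obvious functional check "look up a user by name", so a reward signal based on that check can't tell them apart, yet one is SQL-injectable and the other isn't.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Ticks the functional box, but string interpolation means the
    # caller's input is executed as SQL -- an injection hole.
    return conn.execute(
        f"SELECT id FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(conn, name):
    # Same functional outcome; the parameterized query treats the
    # input as data, not SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

# Both pass the simple functional check a reward model might use...
assert find_user_unsafe(conn, "alice") == [(1,)]
assert find_user_safe(conn, "alice") == [(1,)]

# ...but only the unsafe one leaks the whole table to a crafted input.
payload = "' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2 -- every row
print(len(find_user_safe(conn, payload)))    # 0 -- no match
```

The point being: "does the test pass?" is a cheap, automatable signal, while "is this query safe?" needs either a human reviewer or extra tooling in the loop.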


tin7in | 1 month ago

Peter's (the author's) latest project reuses a lot of these small libraries as tools in a much larger project. Long-term memory is part of that too.

It's an assistant building itself live on Discord. It's really fun to watch.

https://github.com/clawdbot/clawdbot/

mleo | 1 month ago

I somewhat disagree. Sure, if the prompt is “build a fully functional application that does X from scratch”, then of course you are going to get a crap end product, because of what you said and didn't say.

As a developer you would take that and break it down into a design and smaller tasks that show incremental progress, giving yourself a chance to build feature Foo, assess the situation, and either refactor or move forward with feature Bar.

Working with an LLM to build a full-featured application is no different. You need to design the system and break the work down into smaller tasks for it to consume and build. Both it and you can verify the completed work and keep track of things to improve and not repeat as it moves forward with new tasks.

Keeping full guard rails in place, like linters, static analysis and code coverage, further helps ensure what is produced is better-quality code. At some point, are you babysitting the LLM so much that you could write it by hand? Maybe, but I generally think not. While I can get deeply focused and write lots of code, LLMs can still generate code and accompanying documentation, fix static analysis issues and write/run the unit tests without taking breaks or getting distracted. And for some series of tasks, they can run in parallel in separate worktrees, further reducing the aggregate time to complete.
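For anyone unfamiliar with the worktree trick, here's a minimal sketch. The repo, branch names and guard-rail commands are all hypothetical; the point is just that `git worktree` gives each task its own checkout of the same repo, so tasks can run in parallel:

```shell
set -e
# Throwaway demo repo so the worktree commands below actually run.
tmp=$(mktemp -d); cd "$tmp"
git init -q demo && cd demo
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m init

# One worktree per task: separate checkouts, separate branches,
# one shared object store.
git worktree add -q ../feature-foo -b feature-foo
git worktree add -q ../feature-bar -b feature-bar
git worktree list   # three checkouts of one repo

# Guard rails would run per worktree before anything merges, e.g.:
#   (cd ../feature-foo && ruff check . && pytest --cov)

# Clean up once a branch is merged.
git worktree remove ../feature-foo
```

Each worktree can host its own agent session, and the linter/test gate decides what comes back to the main branch.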

I don't expect a developer to build something fully without working on it incrementally with feedback; it's not much different with an LLM if you want meaningful results.