Well, check this out. I know you all might hate Python + React. But for this platform, it works.
I'd like to open-source it.
This video demonstrates "how" I actually get the LLM to "do stuff faster than you can" (I'm talking about coding/dev work here).
You have to understand what the LLM is, how it works, and manage it properly. You have to give it a "world state", and give it blueprints, and give it tools. Then, you can do this several times in parallel, and watch the magic happen.
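As a rough sketch of what "world state + blueprints + tools" means in practice, here's one way the context for a single LLM call could be assembled. None of these names or structures come from the platform in the video - they're illustrative assumptions showing what goes into each call.

```python
# Hypothetical sketch: assemble "world state", "blueprints", and "tools"
# into the context for one LLM call. All names here are made up for
# illustration; this is not the actual platform's API.

def build_context(world_state: dict, blueprints: list[str], tools: list[dict]) -> list[dict]:
    """Assemble chat-style messages for one LLM call.

    world_state -- facts about the repo/environment right now
    blueprints  -- standing instructions / architecture docs
    tools       -- tool definitions the model is allowed to invoke
    """
    system = "\n\n".join([
        "You are a coding agent. Tools are listed separately.",
        *blueprints,
        "Current world state:\n" + "\n".join(f"- {k}: {v}" for k, v in world_state.items()),
    ])
    # Tool schemas would typically be passed to the API alongside the messages.
    return [{"role": "system", "content": system}]

messages = build_context(
    world_state={"branch": "main", "failing_tests": 2},
    blueprints=["Always run the linter before finishing a task."],
    tools=[{"name": "run_tests", "parameters": {"type": "object", "properties": {}}}],
)
```

Running several of these calls in parallel, each with its own slice of world state, is the "several times in parallel" part.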
It's not a small amount of work - learning to use the LLM is a completely new skill. As many have pointed out, it's not like writing code. It's a different kind of thinking/management.
But I believe that if we collaborate on creating the new *TOOLS* and *PLATFORMS* together, instead of just trying to use some shitty chat applications that can't even properly handle context-window truncation and document management, then we will eventually succeed in creating a new wave of real interaction with the LLM, through systems designed specifically for that process.
This is also an example of "the LLM writing code". I didn't write any of the code for the application being demonstrated. The LLM wrote 100% of it. I started in ChatGPT and then moved over to the custom platform as quickly as possible. So this is a "dogfood" project - i.e. fully feeding the LLM back the code it wrote, with the next desired state indicated by the user (me).
Now, the system operates so effectively that, given a standardized set of initial instructions describing the system, the LLM can review the codebase and add docs to the context window, discard those that are unnecessary, write its own reviews and implementation plans, begin a task checklist and sequentially complete the tasks, receive feedback on apply-errors and linter failures, execute terminal commands, run tests, etc. This is multi-turn. IT CAN ALSO spawn NEW LLM calls, allow those calls to return in parallel, let each of them report back and stop its own automation chain, and eventually update the "orchestrator" thread with the "reports" from the agents. After a while it will ping you and let you know your $20 in API tokens was well spent - it actually works.
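The fan-out/fan-in shape of that orchestrator pattern can be sketched in a few lines. The `call_llm` stub below just echoes its task - in the real thing it would be a full multi-turn agent loop (edit, lint, test) returning a summary report; nothing here reflects the platform's actual API.

```python
# Sketch of the orchestrator pattern: spawn sub-agent calls, let them
# return in parallel, fold their reports back into the orchestrator.
from concurrent.futures import ThreadPoolExecutor

def call_llm(task: str) -> str:
    # Stand-in for a real sub-agent call that runs its own multi-turn
    # loop and terminates its own automation chain with a report.
    return f"report: {task} done"

def orchestrate(tasks: list[str]) -> list[str]:
    # One sub-agent per task, running concurrently; map() preserves
    # task order, so reports line up with the checklist.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(call_llm, tasks))

reports = orchestrate(["fix lint errors", "add unit tests"])
# Each report would then be appended to the orchestrator thread's context.
```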
I'm talking about working on codebases large enough that only, say, 10% can fit in the context window at a time, if you don't want to blow past 200k tokens.
I'm talking about where almost any reasonable task is multi-turn/multi-shot, if you want real verification and results and not just some BS.
I'd like to see people start testing platforms like this on other kinds of projects, and tweak the open-source platform itself (through the LLM, which is very good at tweaking this kind of platform) to fit their needs - less popular programming languages, other kinds of environments, the ability to do research, build vector DBs, and use that data in the process... The sky is the limit, really, but we need real platforms for *doing it all on*. Otherwise we're all just complaining about the corporate money-making tools everyone's being sold, thinking they can get shit done with the LLM.
TL;DR: Turns out "LLM chat threads and some shitty MCP server linkages that halfway work + your IDE" is not actually a great recipe for "saving time". But we can build a platform that actually IS the right recipe for saving time and getting accurate results.
lucid-dev | 4 months ago