I can’t find the link to the paper right now, but after reading about how LLMs perform better with task breakdowns, I vastly improved my integrations by having ChatGPT generate prompts that decompose a general task into a series of subtasks, based on a sample input and output. I haven’t needed to build a self-refining system (one or two rounds of task decomposition and refinement produced the expected result for all inputs), but I would assume that's fairly trivial, and that AIs can do it better than humans.
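The loop described above can be sketched roughly as follows. This is a hypothetical illustration, not the commenter's actual code: `llm` stands in for whatever chat-completion call you use, and the prompt wording and helper names (`decompose_task`, `run_plan`) are my own.

```python
def decompose_task(llm, task, example_input, example_output):
    """Ask the model to break a general task into numbered subtasks,
    grounded in one sample input/output pair."""
    prompt = (
        f"Task: {task}\n"
        f"Sample input: {example_input}\n"
        f"Expected output: {example_output}\n"
        "Break this task into a numbered list of small, concrete subtasks, "
        "one per line, that together produce the expected output."
    )
    plan = llm(prompt)
    # Keep only the lines that look like numbered steps.
    return [line.strip() for line in plan.splitlines()
            if line.strip()[:1].isdigit()]

def run_plan(llm, steps, task_input):
    """Feed the input through each subtask in order (one refinement round)."""
    state = task_input
    for step in steps:
        state = llm(f"{step}\nInput: {state}\nOutput:")
    return state
```

A second decomposition round, as described above, would just mean calling `decompose_task` again on any subtask whose output misses the expected result.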
This is also an area where I expect OpenAI will continue to demolish the competition. The ability to recursively generate and process large prompts is truly nuts. I tried swapping in some of the “high-performing” LLaMA models and they all choked on anything more than a paragraph.
Capable-enough LLMs are human-level at lots of things. Reinforcement learning from AI feedback (RLAIF) is a thing (the Anthropic Claude models use it). Strictly speaking, it's not necessary to have humans in the loop for a lot of these tasks.
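The core of the AI-feedback idea is that preference labels come from a model rather than a human annotator. A minimal sketch, assuming a `judge` callable that stands in for a real model call; the function names and the one-line "principle" are illustrative, not Anthropic's actual setup:

```python
def ai_preference_label(judge, prompt, response_a, response_b):
    """Ask a judge model which of two responses better follows a principle.
    Returns a (chosen, rejected) pair for a preference dataset."""
    verdict = judge(
        "Principle: choose the more helpful, harmless answer.\n"
        f"Prompt: {prompt}\n(A) {response_a}\n(B) {response_b}\n"
        "Answer with exactly A or B."
    )
    if verdict.strip().upper().startswith("A"):
        return response_a, response_b
    return response_b, response_a

def build_preference_dataset(judge, samples):
    """samples: iterable of (prompt, resp_a, resp_b) triples.
    No human in the loop: every label comes from the judge model."""
    return [(p,) + ai_preference_label(judge, p, a, b) for p, a, b in samples]
```

The resulting (prompt, chosen, rejected) triples are what a reward model would then be trained on.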
Some are hesitant to admit we've created human-level general intelligence, but saying otherwise doesn't really hold up to scrutiny.
I see people saying things like this, but I have yet to see anyone show data for a non-trivial workflow with human-level accuracy over a wide range of inputs, without a human in the loop.
Headlines like this hint at AI improving itself, in this case by prompting itself. But as we see in reinforcement learning, algorithms that act and improve themselves are not new. The interesting thing will be whether or not they eventually "collapse".
For example, if an RL algorithm is performing well on an Atari game, you can stop the training and just let the agent run for years and the performance will remain about the same. However, if you allow the agent to continue training, it's not clear whether it will (1) continue improving, (2) stay about the same, or (3) collapse, perform much worse, and never recover. I'm not an RL expert, but I've spent a lot of time experimenting and implementing the algorithms myself, and I've seen all 3 of these scenarios play out. I'm never quite sure what's going to happen so long as I allow the training to continue.
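The freeze-vs-keep-training distinction can be made concrete with a toy example. This is a deliberately tiny sketch of my own (epsilon-greedy value learning on a two-armed bandit), not one of the Atari setups above: once you stop calling the update, the greedy policy is fixed forever; keep calling it and the values keep moving.

```python
import random

def train_bandit(q, probs, steps, lr=0.1, eps=0.1, rng=None):
    """Epsilon-greedy value learning on a Bernoulli bandit.
    Mutates and returns the Q-value table q."""
    rng = rng or random.Random(0)
    for _ in range(steps):
        if rng.random() < eps:                      # explore
            arm = rng.randrange(len(q))
        else:                                       # exploit
            arm = max(range(len(q)), key=q.__getitem__)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        q[arm] += lr * (reward - q[arm])            # incremental update
    return q

def greedy_policy(q):
    """The deployed behavior: always pick the highest-valued arm."""
    return max(range(len(q)), key=q.__getitem__)

# "Stop the training": freeze q here and greedy_policy(q) never changes,
# no matter how long the agent runs. Calling train_bandit again is the
# risky case described above: the values, and possibly the policy, drift.
q = train_bandit([0.0, 0.0], probs=[0.2, 0.8], steps=2000)
frozen_action = greedy_policy(q)
```

With a fixed seed the trained agent reliably settles on the better arm; the point is only that evaluation of a frozen table is deterministic, while continued updates are not guaranteed to stay put.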
GPT-4 will remain GPT-4 forever, and that's amazing, but just because GPT-4 is stable and amazing while it's not in training mode doesn't mean it will remain stable if we allow it to bootstrap: prompt itself, prepare its own training data, and so on.
But in seriousness - language models may be scaling in sophistication exponentially with time, but software engineering problems scale in complexity (on average) exponentially with lines of code. The base of this exponential function isn't large, but it's more than 1.
In the end there's a need for someone who understands what they're doing.
Personally, I use ChatGPT to discover libraries that solve my problems and the ~70% success ratio that I'm seeing with this is enough for me for now.
bugglebeetle | 2 years ago
axiom92 | 2 years ago
bravogamma | 2 years ago
famouswaffles | 2 years ago
lukasb | 2 years ago
macrolocal | 2 years ago
Buttons840 | 2 years ago
m3kw9 | 2 years ago
jahewson | 2 years ago
skybrian | 2 years ago
kaesar14 | 2 years ago
drooby | 2 years ago
The future I see is that everyone is about to become a CEO with a personal assistant that can run a business.
So I'm going to start building something of my own starting now.
Tade0 | 2 years ago
rapind | 2 years ago
riku_iki | 2 years ago
low_tech_punk | 2 years ago
kaesar14isazero | 2 years ago
[deleted]
gandalfgeek | 2 years ago
https://youtu.be/oXtZAUPGHn8
opengears | 2 years ago
[deleted]