We aren’t good at creating software systems from reliable and knowable components. A bit skeptical that the future of software is making a Rube Goldberg machine of black box inter-LLM communication.
I could see software having a future as a Rube Goldberg machine of black box AIs, if hardware is cheap enough and the AIs are good enough. There was a scifi novel (maybe "A Fire Upon the Deep"?) where there was no need to write software because AI could cobble any needed solution together by using existing software and gluing it together. Throwing cycles at deepening layers was also something that Paul Graham talked about in the hundred year language (https://paulgraham.com/hundred.html).
Now, whether hardware is cheap enough or AI is smart enough is an entirely different question...
As someone who makes HW for a living, please do make more Rube Goldberg machines of black box LLMs. At least for a few more years until my kids are out of college. :)
Here's a practical example in this vein, but much simpler: if you're trying to answer a question with an LLM, and have it answer in JSON format within the same prompt, for many models the accuracy is worse than just having it answer in plaintext. The reason is that you're now placing a bet that the distribution of JSON strings it's seen before meshes nicely with the distribution of answers to that question.
So one remedy is to have it just answer in plaintext, and then use a second, more specialized model that's specifically trained to turn plaintext into JSON. Whether this chain of models works better than just one model doing both depends on the distribution-mismatch penalties accrued along the chain.
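To make the shape of that two-stage chain concrete, here's a minimal sketch. The two model calls are stubbed out with hardcoded stand-ins (`answer_model` and `to_json_model` are hypothetical names, not any real API); in practice each would be a call to a different LLM.

```python
import json

def answer_model(question: str) -> str:
    # Stage 1 (stubbed): a general model answers in free-form plaintext,
    # with no JSON constraint distorting its output distribution.
    return "The capital of France is Paris."

def to_json_model(plaintext: str) -> str:
    # Stage 2 (stubbed): a smaller model trained specifically to
    # extract structure from plaintext and emit a JSON string.
    return json.dumps({"answer": "Paris"})

def ask(question: str) -> dict:
    # Chain the two stages: plaintext answer first, structuring second.
    plaintext = answer_model(question)
    return json.loads(to_json_model(plaintext))

result = ask("What is the capital of France?")
print(result["answer"])
```

The point of the split is that each stage only has to match one distribution: the first model sees questions and produces answers, the second sees plaintext and produces JSON.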