jvm | 3 years ago
1. Use some sort of instruction tuning to get the thing "good enough" that it gives decent results 75% of the time and the other 25% a human has to take over.
2. Use the actual usage data as training input. Punish bad behaviors and show the model what the human did to solve the problem.
3. Use this training loop to progressively have the model take over a larger % of the time.
…and I think if you can't get (1) good enough to be worth using, it's going to be really hard to get the loop going.
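The three steps above could be sketched roughly as follows. This is a hypothetical toy, not a real training pipeline: `Model`, `respond`, `fine_tune`, and the confidence scores are all illustrative stand-ins.

```python
# Toy sketch of the human-in-the-loop training scheme described above.
# All names and numbers here are illustrative assumptions.

class Model:
    def __init__(self):
        self.known = {}  # tasks "learned" from human corrections

    def respond(self, task):
        # Returns (answer, confidence); confident only on tasks seen in training.
        if task in self.known:
            return self.known[task], 0.9
        return "guess", 0.3

    def fine_tune(self, examples):
        # Step 2: use actual usage data (the human's fixes) as training input.
        for task, human_answer in examples:
            self.known[task] = human_answer


def handle(model, task, human_oracle, buffer, threshold=0.75):
    # Step 1: the model answers on its own when it is confident enough.
    answer, conf = model.respond(task)
    if conf >= threshold:
        return answer, "model"
    # Otherwise a human takes over, and the fix is recorded for training.
    fixed = human_oracle(task)
    buffer.append((task, fixed))
    return fixed, "human"


# Step 3: over successive passes the model takes over a larger share.
model, buffer = Model(), []
tasks = ["a", "b", "c"]
oracle = lambda t: t.upper()  # stand-in for the human operator

first = [handle(model, t, oracle, buffer)[1] for t in tasks]   # all "human"
model.fine_tune(buffer)
second = [handle(model, t, oracle, buffer)[1] for t in tasks]  # all "model"
```

The moving part that makes the loop go is the buffer of (task, human fix) pairs: each fine-tuning pass shifts work from the "human" branch to the "model" branch.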
huijzer | 3 years ago
How does the model know when a human has to take over?
I think most extrapolations of current "AI" capabilities into future capabilities are fun and useful in some ways, but also doomed to fail. It's very easy to miss a tiny detail which may in practice be a fundamental problem.
> Use the actual usage data as training input.
Given that those bigger state-of-the-art models train on terabytes of data, how would you know how much training data to generate to sufficiently change the output?
My understanding of "AI" is that it's mostly about some very complex models which are capable of solving previously unsolvable problems. However, those problems are always extremely specific. Going the other way, starting from a problem or a future possibility and then applying "AI" to it, is likely to fail.
notahacker | 3 years ago
The challenge comes when the AI has to interpret questions that can be expressed in syntactically similar ways with very different or precisely opposite meanings, so it's very confidently (and plausibly) wrong about things like price changes and tax, event timings, refunds, etc.
twaw | 3 years ago
For example, image generators are watched for NSFW content by a separate AI critic.
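That generator/critic split could be sketched as below. This is a hypothetical gate, not any vendor's actual pipeline: `generate` and `nsfw_score` stand in for the real models.

```python
# Hypothetical sketch of a separate AI critic screening a generator's output.
# `generate` and `nsfw_score` are illustrative stand-ins, not a real API.

def safe_generate(prompt, generate, nsfw_score, threshold=0.5, max_tries=3):
    """Re-sample until the critic passes the image, else refuse."""
    for _ in range(max_tries):
        image = generate(prompt)
        if nsfw_score(image) < threshold:  # critic screens every output
            return image
    return None  # refuse rather than return a flagged image
```

The design point is that the critic is a separate model with its own failure modes, so the gate bounds retries and falls back to refusal instead of looping forever.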
svnt | 3 years ago
It’s incredibly easy: you ask “did this answer solve your issue?” and add a max_tries.
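A minimal sketch of that rule, assuming hypothetical `model_answer` and `user_confirms` callbacks (neither is from the thread):

```python
# Sketch of the escalation rule above: ask the user whether each answer
# worked, and hand off to a human after max_tries failures.

def resolve(ticket, model_answer, user_confirms, max_tries=3):
    for attempt in range(max_tries):
        answer = model_answer(ticket, attempt)
        if user_confirms(answer):  # "did this answer solve your issue?"
            return answer, "model"
    return None, "human"  # signal that a human has to take over
```

The `(ticket, failed_answer, human_fix)` triples that fall out of the "human" branch are exactly the usage data the training loop upthread would consume.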
> … how do you know how much training data to generate …?
You don’t; you keep doing it until the results improve enough to meet your goals, or they stop short and you switch tactics.
TuringTest | 3 years ago
There's no recourse to manual adjustment of system behavior by the user. Research articles should insist on including the user in the loop from the beginning.
svnt | 3 years ago
The ones in business use absolutely are.