top | item 33635388

jvm | 3 years ago

Author here, yeah I agree the real world is very complicated. For any given task I'm imagining something like

1. Use some sort of instruction tuning to get the thing "good enough" that it gives decent results 75% of the time and the other 25% a human has to take over.

2. Use the actual usage data as training input. Punish bad behaviors and show the model what the human did to solve the problem.

3. Use this training loop to progressively have the model take over a larger % of the time.

…and I think if you can't get (1) good enough to be worth using it's going to be really hard to get the loop going.
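
A toy sketch of that loop in Python, with entirely made-up numbers and a stand-in for fine-tuning; nothing here is a real training API:

```python
def bootstrap_loop(task_difficulties, model_skill, rounds=3):
    """Toy model of the three steps above: tasks the model can't handle
    are escalated to a human (step 1), the human's solutions act as
    training data that nudges the model's skill up (step 2), and the
    handled fraction grows round over round (step 3). All numbers are
    illustrative, not a real training procedure."""
    coverage = []
    for _ in range(rounds):
        handled = [t for t in task_difficulties if t <= model_skill]
        escalated = [t for t in task_difficulties if t > model_skill]
        coverage.append(len(handled) / len(task_difficulties))
        # stand-in for fine-tuning on the human corrections
        model_skill += 0.05 * len(escalated)
    return coverage
```

The point of the sketch is the dependency: if `model_skill` starts too low, almost everything escalates and the loop never earns its keep, which is exactly the concern about step (1).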

huijzer|3 years ago

> Use some sort of instruction tuning to get the thing "good enough" that it gives decent results 75% of the time and the other 25% a human has to take over.

How does the model know when a human has to take over?

I think most extrapolations of current "AI" capabilities into future capabilities are fun and useful in some ways, but also doomed to fail. It's very easy to miss a tiny detail which may in practice be a fundamental problem.

> Use the actual usage data as training input.

Given that those bigger state-of-the-art models train on terabytes of data, how would you know how much training data to generate to sufficiently change the output?

My understanding of "AI" is that it's mostly about some very complex models which are capable of solving previously unsolvable problems. However, those problems are always extremely specific. Going the other way, thinking of problems or future possibilities first and then applying "AI" to them, is likely to fail.

notahacker|3 years ago

Much of the time, knowing when the human has to take over isn't one of the more difficult problems: either the AI can't map the user input to a possible continuation with any high probability, or it interprets the user input as an expression of frustration or an assertion that it's wrong.

The challenge is when the AI has to interpret questions about things that can be expressed in syntactically similar ways but with very different or precisely opposite meanings, so it's very confidently (and plausibly) wrong about stuff like price changes and tax, event timings, refunds, etc.
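
One plausible version of those two handover signals in Python; the threshold and marker list are purely illustrative assumptions, not anything a real system ships with:

```python
# Hypothetical handover heuristic: the marker list and logprob threshold
# are illustrative guesses, not values from any real deployment.
FRUSTRATION_MARKERS = ("wrong", "not what i asked", "human", "agent")

def should_hand_over(token_logprobs, user_message, threshold=-2.5):
    """Hand over when the model's own continuation was low-probability,
    or the user's reply reads as frustration / an assertion it's wrong."""
    avg = sum(token_logprobs) / len(token_logprobs)
    frustrated = any(m in user_message.lower() for m in FRUSTRATION_MARKERS)
    return avg < threshold or frustrated
```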

twaw|3 years ago

AI should be watched by an AI critic (or an AI guard), whose goal is to detect harmful, dangerous, stupid, or surprising behavior and raise an alarm.

For example, image generators are watched for NSFW content by a separate AI critic.
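As a sketch, the critic pattern is just a second model sitting between the generator and the user; both callables and the alarm threshold here are hypothetical stand-ins:

```python
def guarded_generate(generator, critic, prompt, max_attempts=3):
    """Run a separate critic over every output; release it only if the
    critic's alarm score stays below a threshold, otherwise retry and
    finally raise the alarm. Threshold and interfaces are illustrative."""
    for _ in range(max_attempts):
        output = generator(prompt)
        if critic(output) < 0.5:  # e.g. an NSFW / harm classifier score
            return output
    raise RuntimeError("critic blocked every attempt; raising alarm")
```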

svnt|3 years ago

> How does the model know when a human has to take over?

It’s incredibly easy: you ask “did this answer solve your issue?” and add a max_tries.
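
That suggestion fits in a few lines; the callback names and the escalation sentinel are made up for illustration:

```python
def handle_issue(answer_fn, solved_fn, max_tries=3):
    """Ask "did this answer solve your issue?" after every attempt and
    escalate to a human once max_tries answers have failed. Names are
    hypothetical stand-ins for the real model call and user prompt."""
    for attempt in range(max_tries):
        answer = answer_fn(attempt)
        if solved_fn(answer):  # the "did this solve it?" check
            return answer
    return "ESCALATE_TO_HUMAN"
```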

> … how do you know how much training data to generate …?

You don’t; you keep doing it until the results improve enough to meet your goals, or they stop short and you switch tactics.
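
Written as a stopping rule it looks something like this; the goal, patience, and metric are all assumptions for the sketch:

```python
def train_until(eval_fn, add_training_data, goal=0.9, patience=2):
    """Keep generating usage data and re-evaluating until the metric
    meets the goal, or declare a plateau and switch tactics. Goal and
    patience values are illustrative, not recommendations."""
    best, stale = 0.0, 0
    while True:
        add_training_data()
        score = eval_fn()
        if score >= goal:
            return "goal met"
        if score <= best:
            stale += 1
            if stale >= patience:
                return "plateau: switch tactics"
        else:
            best, stale = score, 0
```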

TuringTest|3 years ago

The main problem is that most new commercial AI systems aren't really designed to let the human take over. It's a closed loop with an up-front designed experience, and the only option in case of failure is to stop using the system.

There's no recourse to manual adjustment of system behavior by the user. Research articles should insist on including the user in the loop from the beginning.

svnt|3 years ago

> Main problem is that most new commercial AI systems usually aren't really designed to let the human take over.

The ones in business use absolutely are.