
Deal or no deal? Training AI bots to negotiate

110 points | runesoerensen | 8 years ago | code.facebook.com

16 comments

[+] paskster | 8 years ago
Interesting that the chatbots learned to show "fake" interest in an item, just to concede it later in the negotiation process.

But I think what is missing is the time component when negotiating with humans. A negotiation usually goes better for humans when it is quick rather than dragging on too long.

And more importantly, the chatbots never seemed to "walk away" from a deal. But in real life, you sometimes have to walk away to show the other party that you are not a pushover. It would be interesting to enhance the model so that chatbots negotiate repeatedly with each other and "remember" how the other party behaves and how far it can be pushed to concede. Because some negotiations really are zero-sum games.

[+] al_chemist | 8 years ago
> But in real life, you sometimes have to walk away to show the other party, that you are not a pushover.

AIs consider this human behavior a bug.

[+] zitterbewegung | 8 years ago
I bet you could make a walk-away function with a threshold, and that would trigger an iterated prisoner's dilemma subroutine.
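The idea above can be sketched in a few lines. This is a hypothetical illustration of the comment, not anything from the paper: `negotiate`, the reservation threshold, and the tit-for-tat penalty after a walk-away are all made up for the example.

```python
def negotiate(offer_value, reservation, opponent_walked_last_round=False):
    """Return 'accept', 'counter', or 'walk' for a single round.

    A toy walk-away rule: accept offers at or above the reservation
    value, walk away from offers below half of it, counter otherwise.
    """
    if opponent_walked_last_round:
        # Tit-for-tat flavor: demand more after the other side walked away,
        # so repeated play punishes aggressive opponents.
        reservation *= 1.1
    if offer_value >= reservation:
        return "accept"
    if offer_value < 0.5 * reservation:
        return "walk"  # below the threshold: signal we are not a pushover
    return "counter"
```

Running repeated rounds of this against another such agent, while remembering past walk-aways, would give the iterated-prisoner's-dilemma dynamic the comment describes.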
[+] wlamond | 8 years ago
It'd be interesting if the agents developed their own language during the reinforcement learning stage that is unintelligible to humans but allows them to quickly navigate the negotiation. They use the model trained in a supervised way during the reinforcement learning stage to avoid this, but I'm curious to see what the agent learns when paired against another reinforcement learning agent.

Edit: Indeed, the paper says that not using the fixed agent trained on human negotiation leads to unintelligible language from the agents.

[+] EGreg | 8 years ago
Can we measure whether the language is more efficient at getting deals done?
[+] jakebasile | 8 years ago
That really seems like the plot of a Michael Crichton novel. Fascinating.
[+] phreeza | 8 years ago
The most interesting thing to me is not the negotiation tactics that the agents learn but the idea of coming up with a more easily quantifiable (and therefore differentiable) quality metric for dialogue tasks.
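To make the point concrete: in this negotiation setting the dialogue outcome reduces to a scalar score (the total private value of the items an agent ends up with), which is what makes it usable as an optimization target. A minimal sketch, with the item names and values invented for illustration:

```python
def deal_score(allocation, values):
    """Score a negotiated outcome: the sum of item counts received,
    weighted by the agent's private per-item values."""
    return sum(count * values[item] for item, count in allocation.items())

# Example: an agent that values books at 2, hats at 5, and balls at 0
# walks away with 1 book, 2 hats, and 1 ball.
values = {"book": 2, "hat": 5, "ball": 0}
print(deal_score({"book": 1, "hat": 2, "ball": 1}, values))  # 12
```

A scalar like this can drive reinforcement learning directly, which is much harder with open-ended chit-chat, where "quality" has no obvious numeric definition.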
[+] EGreg | 8 years ago
And so it begins.

I am worried when computers start getting better than people at these kinds of things. They've already mastered heads-up poker.

Almost all of our systems rely on attackers being inefficient, so they are vulnerable.

[+] visarga | 8 years ago
So in a very limited sense these agents have a theory of mind - they can infer the beliefs and goals of their opponents and act accordingly. Agents/objects can be in an exponential number of possible relative positions, but this system factorizes structure from function.
[+] pmontra | 8 years ago
It would be interesting to see what happens when new untrained bots start negotiating with trained ones. One new language for each pair of bots, or one common language that every new bot has to learn?
[+] Zpalmtree | 8 years ago
>To prevent the algorithm from developing its own language, it was simultaneously trained to produce humanlike language.

Not sure what this would look like, but I'd be interested to find out.
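One plausible shape for it is a weighted objective that mixes the negotiation reward with a supervised "stay humanlike" likelihood term. This is a guess at the general technique, not the paper's exact formulation; the function name and the weighting scheme are assumptions:

```python
def combined_loss(rl_loss, human_nll, alpha=0.5):
    """Interpolate a reinforcement-learning loss with the negative
    log-likelihood of the agent's utterances under a language model
    trained on human dialogues. Higher alpha favors deal value;
    lower alpha pulls the policy toward humanlike language."""
    return alpha * rl_loss + (1.0 - alpha) * human_nll
```

With alpha tuned low enough, utterances that drift into private bot-speak get a large likelihood penalty even if they would win better deals, which is one way to keep the language intelligible.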