This is basically a re-implementation of the recent work from OpenAI[1]: first collect high-quality (observation, action, next-observation) transition tuples, train a model to predict the action between observations, then use that trained model to label a much larger observation-only dataset (e.g. YouTube videos of people playing a game).
Finally, you just do dead-simple behavior cloning (predict "expert" action from current observation) on the large labeled dataset. It's still surprising to me how well this works! Behavior cloning has some theoretical issues since it doesn't address the sequential nature of decision problems at all, but apparently with enough scale it can still do fairly well - super cool.
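The pipeline described above can be sketched end-to-end on a toy problem. Everything here (the 1-D world, the threshold "models") is invented for illustration and is nothing like the real VPT implementation, but the three stages are the same: train an inverse dynamics model on a small labeled set, pseudo-label a big observation-only set, then behavior-clone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D world (made up for illustration): the expert walks toward the
# origin; action 1 = step right (+0.1), action 0 = step left (-0.1).
def expert_action(o):
    return (o < 0).astype(int)

def step(o, a):
    return o + np.where(a == 1, 0.1, -0.1)

# 1. Small labeled dataset of (obs, action, next_obs) transitions.
obs = rng.uniform(-1, 1, 200)
act = expert_action(obs)
nxt = step(obs, act)

# 2. "Inverse dynamics model": predict the action from (obs, next_obs).
#    In this toy world it reduces to a learned threshold on displacement.
d = nxt - obs
thr = (d[act == 1].min() + d[act == 0].max()) / 2
idm = lambda o, o2: ((o2 - o) > thr).astype(int)

# 3. Use the IDM to pseudo-label a much larger observation-only dataset
#    (we observe the states but never the actions that produced them).
big_obs = rng.uniform(-1, 1, 5000)
big_nxt = step(big_obs, expert_action(big_obs))
pseudo = idm(big_obs, big_nxt)

# 4. Behavior cloning: predict the pseudo-labeled action from obs alone.
#    For this toy policy a single threshold on obs suffices.
bc_thr = (big_obs[pseudo == 1].max() + big_obs[pseudo == 0].min()) / 2
bc_policy = lambda o: (o < bc_thr).astype(int)
```

The cloned policy now reproduces the expert from current observations only, which is the whole point: the expensive action labels were only needed for the small IDM training set.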
What they did with Dota 2 was really surprising. The AI developed a technique where, if it was ahead, it would go behind the enemy tower to collect 100% of the gold while the enemy struggled with last hits under the tower and took creep damage. I'd never seen anyone do this before, and pros started copying it afterwards. The way I remember them explaining it was that the AI has something it's trying to maximize, and it plays millions of games trying slightly different variations, which can lead to this kind of creativity. Pretty wild.
Fascinating! You can see it is trying to mimic many gameplay tactics used by CS:GO players, but it's just barely missing the mark. It's like a monkey trying to mimic human behaviour.
Examples:
- Follows the wall contour using the crosshairs when moving around the corners. The only thing wrong here is that the AI is aiming at the wall instead of at the open space next to the wall where the enemy would be.
- After going through a door, it checks all the corners to clear the room. But it's very slow in doing this and would get shot instantly in a real game.
- Adjusts the crosshair constantly to not look at the floor when walking. The problem is that the crosshair is never at headshot level!
I am not bashing this AI, btw, just pointing out that it's actually getting close to real human behaviour! Fascinating!
Something I've noticed it does well: the AI doesn't spray 'n' pray but controls the spray pattern and drags the crosshair down. It shoots in bursts!
Excellent. It believably plays like a pro with regard to spray patterns, though maybe slightly slower. This is doom for any cheat detection that relies on outliers. It doesn't look aimbotty in the slightest.
Interesting. What's stopping the AI from more or less instantly detecting human shapes and then dishing out headshots just as instantly? Technically, that should be possible. I suppose its training material is strictly actual humans, so it would never learn that, even if it's technically feasible.
Kind of a missed opportunity that they scaled the whole screen down to 280x150. They could have left the crosshair area unscaled--say a 10x10 patch around the crosshair--and, instead of doing weird warping geometry, just passed it in as a second 2D image. I'm guessing the focus of the research is on the behavior cloning, which is really awesome!
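That two-stream idea can be sketched roughly as follows. This is just an illustration of the suggestion above, not anything from the paper: it assumes a centred crosshair, uses a simple nearest-neighbour downscale, and the sizes are the ones proposed in the comment.

```python
import numpy as np

def two_stream(frame, crop=10, out_hw=(150, 280)):
    """Return (downscaled full view, unscaled patch around the crosshair)."""
    h, w = frame.shape[:2]
    cy, cx = h // 2, w // 2  # assume the crosshair sits at screen centre
    # Unscaled "fovea" patch: full-resolution detail where aim matters most.
    fovea = frame[cy - crop // 2: cy + crop // 2,
                  cx - crop // 2: cx + crop // 2]
    # Nearest-neighbour downscale of the whole frame by index sampling.
    ys = np.arange(out_hw[0]) * h // out_hw[0]
    xs = np.arange(out_hw[1]) * w // out_hw[1]
    return frame[np.ix_(ys, xs)], fovea
```

The two arrays could then be fed to the network as separate inputs, so the model keeps global context without losing pixel-level detail at the crosshair.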
> Whilst RL research often aims to maximise reward, we emphasise that this is not the exclusive objective of this paper – perfect aim can be achieved through simple geometry and accessing backend information about enemy locations (hacks and built-in bots exploit this). Rather, we aim to produce an agent that plays in a humanlike fashion, that is fun and challenging to play with and against.
I’m very ignorant when it comes to AI, but I’d like to have a machine watch me select some substrings out of different sequences of text and have it find the pattern behind all the choices I make. Is there something like that I can train?
That sounds a bit like the SQuAD 2.0 challenge, but simpler.
In SQuAD 2.0 a model is given a question, and a passage of information. It has to select the start and end points of where the answer is contained in the passage.
Your problem is simpler in that there is no question - you just have to have a model that can identify the start and end positions that _you_ will select.
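The span-selection step can be sketched like this. It's a hedged illustration: assume some model has already produced a per-token start score and end score (how it does so is the hard part); decoding then just picks the best legal pair, as SQuAD-style systems commonly do.

```python
import numpy as np

def best_span(start_logits, end_logits, max_len=15):
    """Return (i, j), j >= i, maximizing start_logits[i] + end_logits[j]."""
    n = len(start_logits)
    best_score, best = -np.inf, (0, 0)
    for i in range(n):
        # only consider spans up to max_len tokens long
        for j in range(i, min(n, i + max_len)):
            s = start_logits[i] + end_logits[j]
            if s > best_score:
                best_score, best = s, (i, j)
    return best
```

For the no-question variant described above, the model producing the logits would simply be trained on your own past selections instead of question/passage pairs.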
Sounds like Excel's Flash Fill might be similar to what you're interested in. It's a well-documented feature, described in papers and talks, and it comes from a pretty cool group at Microsoft Research.
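A minimal Flash-Fill-flavoured sketch of programming by example, far simpler than the real feature: given (input, output) pairs where each output is a substring of its input, search for constant offsets from both ends that explain every pair (all names here are hypothetical).

```python
def infer_extractor(examples):
    """Infer a substring extractor x -> x[s : len(x) - e] from examples,
    where each output must occur as a substring of its input."""
    cands = None
    for inp, out in examples:
        local = set()
        start = inp.find(out)
        while start != -1:
            # candidate program: (offset from start, offset from end)
            local.add((start, len(inp) - (start + len(out))))
            start = inp.find(out, start + 1)
        # keep only programs consistent with every example so far
        cands = local if cands is None else cands & local
    if not cands:
        return None
    s, e = sorted(cands)[0]
    return lambda x: x[s: len(x) - e]
```

Real Flash Fill searches a much richer program space (token boundaries, regex-like positions, concatenation), but the "intersect the candidate programs across examples" idea is the same flavour.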
It has already been a couple of days, but here are my $0.02 (for context, I have played CS:GO for 3k+ hours):
this game has much more going on than games like Mario, and even Dota 2. For one, audio matters! Also, there are more inputs to train, such as walk, jump and crouch.
This is a great start, but a lot more needs to be done before it can be called a complete solution.
They could take a 20-year-old who has never played FPSes and run similar tests. Coordination, twitch muscle reflexes, peripheral vision and other variables will never exist with AI. GLHF, as they say.
Hopefully not too off-topic, but AI playing chess, doing this or that "better than humans" is not a 1st world problem.
I never followed the chess scene and was intrigued by the Hans Niemann angle, but noticed the chess engines being overblown in a lot of it (play-ping-pong-against-a-wall type of stuff).
The 1st thing I thought of was a chess format called "turn the tables", where players make the 1st 3-5 moves, then "turn the board" and compete with the opposing player's setup (I did not research this, and it may already exist).
A Deep Blue versus Garry Kasparov rematch using "turn the tables" gameplay would be cool.
I did laugh at Magnus Carlsen winning in one move when the opponent resigned after a crappy knight move by Magnus at the recent tournament. TtT chess takes a lot of the memorization strategies out of play for the natural GMs.
With much of this AI stuff, nobody is chasing AI because of its severe limitations, impediments and attribute deficiencies. Just play ping-pong against a wall.
[1]: https://openai.com/blog/vpt/
https://youtu.be/KTY7UhjIMm4
I'd love to see the same method, but with audio inputs too.
You might not realize how bad that actually is. Built in bots on medium difficulty can't even challenge a complete beginner.
love it!
https://github.com/TeaPearce/Counter-Strike_Behavioural_Clon...
Also so awesome. I've been wondering how people automate games; it looks super easy and simple, no longer a big mystery :D
> Nah bro, let's teach it to shoot people in CS:Go!