(no title)
donadigo | 1 year ago
Just wanted to provide some perspective here on how many things these projects need to take care of in order to get a training setup going.
I'm the developer behind TMInterface [1] mentioned in this post, which is a TAS tool for the older TrackMania game (Nations Forever). For Linesight (the last project in this post), I recently ended up working with its developers to provide them the APIs they need to access from the game.
There are a lot of things RL projects usually want to do: speed up the game (one of the most important), deterministically control the vehicle, get simulation information, navigate menus, skip cut scenes, make save states, capture screenshots, etc. Having each of those things implemented natively greatly improves the stability and performance of training/inference for an RL agent. For example, the latest version of the project uses a direct capture of the surface that's rendered to the game window instead of an external Python library (DxCam). This is faster, doesn't require any additional setup, and also allows training even if the game window is completely occluded by other windows.
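To give a rough idea of what that looks like from the training side, here's a sketch of a gym-style wrapper over such a game API. The client class and every method name below are made up for illustration; they are not the actual TMInterface or Linesight interfaces:

    # Hypothetical sketch of an RL-facing wrapper around a game-control API.
    # GameClient and its method names are invented for illustration and are
    # not the real TMInterface/Linesight interfaces.

    import numpy as np

    class GameClient:
        """Stand-in for a native game-control API."""
        def set_speed(self, factor: float) -> None: ...   # run physics faster than real time
        def set_inputs(self, steer: float, gas: bool, brake: bool) -> None: ...
        def get_state(self): ...                           # position, velocity, checkpoints, ...
        def save_state(self): ...                          # snapshot of the simulation state
        def load_state(self, snapshot) -> None: ...
        def capture_frame(self) -> np.ndarray: ...         # direct capture of the rendered surface
        def restart_race(self) -> None: ...                # skip menus/cut scenes, reset timer

    class TrackmaniaEnv:
        """Minimal gym-style environment built on top of the client."""
        def __init__(self, client: GameClient, speedup: float = 10.0):
            self.client = client
            self.client.set_speed(speedup)  # one of the biggest wins for training throughput

        def reset(self):
            self.client.restart_race()
            return self.client.capture_frame()

        def step(self, action):
            steer, gas, brake = action
            self.client.set_inputs(steer, gas, brake)  # deterministic control each tick
            obs = self.client.capture_frame()
            state = self.client.get_state()
            reward = self.compute_reward(state)        # e.g. progress along the track
            done = bool(getattr(state, "finished", False))
            return obs, reward, done, {}

        def compute_reward(self, state) -> float:
            return 0.0  # placeholder; real rewards use track progress / speed

Each of those calls hides a lot of work on the native side, which is exactly why doing them inside the game instead of via external tooling pays off.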
There are also many other smaller annoying things: many games throttle FPS when the window is unfocused (which is the case here too), so the tool patches out this behaviour for the project, and there's plenty more like that. The newest release of Linesight V3 [2] can reliably approach world records, and it's being trained and experimented with by quite a few people. The developers made it easy to set up and documented a lot of the process [3].
[1] https://donadigo.com/tminterface/
brutus1213|1 year ago
I've had a few brushes with RL (with collaborators who knew more RL than I did). A key issue we encountered across different problem settings was the number of samples required to train. We created a headless version of the underlying environment but couldn't make it go much faster than real time. We also did some work to parallelize, but it wasn't enough (and it was expensive). Is the TM-related RL training happening in real time, or is it possible to speed it up? That seemed like the key problem to solve before RL can be widely used, but I'm curious about your thoughts.
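For context, by parallelizing I mean the usual pattern of several environment workers feeding a shared queue, something like this (a toy, generic sketch with a stand-in environment, not our actual code):

    # Toy sketch of the usual worker/queue pattern for parallel experience
    # collection (the environment here is a stand-in, not a real setup).

    import multiprocessing as mp
    import queue as queue_mod
    import random

    def toy_env_step(state):
        """Stand-in for one environment step: (next_state, reward, done)."""
        next_state = state + 1
        return next_state, random.random(), next_state >= 100

    def rollout_worker(worker_id, transitions, episodes):
        for _ in range(episodes):
            state, done = 0, False
            while not done:
                next_state, reward, done = toy_env_step(state)
                transitions.put((worker_id, state, reward, next_state, done))
                state = next_state

    if __name__ == "__main__":
        transitions = mp.Queue(maxsize=10_000)
        workers = [mp.Process(target=rollout_worker, args=(i, transitions, 5))
                   for i in range(4)]
        for w in workers:
            w.start()

        collected = 0
        # The learner drains transitions as they arrive instead of waiting
        # for any single environment instance to finish an episode.
        while any(w.is_alive() for w in workers) or not transitions.empty():
            try:
                transitions.get(timeout=0.1)
                collected += 1   # in a real trainer: push into a replay buffer
            except queue_mod.Empty:
                continue

        for w in workers:
            w.join()
        print(f"collected {collected} transitions from {len(workers)} workers")

That helps with wall-clock time, but each individual environment was still stuck near real time, which is why the cost added up.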
donadigo|1 year ago
We're lucky in the case of TrackMania because it internally has systems both to set the relative game speed and to completely disable all rendering and just run the physics. Linesight achieves roughly a 10x speedup, with most of the time now spent rendering game frames and running inference on the network. They also parallelize training by running more game instances and implementing a training queue. For the "raw" speedup ratio, TM usually achieves about 60x (one minute is simulated in one second), and I use this speedup to implement the bruteforce functionality in the tool (coupled with a custom save-state implementation).
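Conceptually the bruteforce part is just random search over the input sequence: perturb one input, re-simulate, and keep the change if the finish time improves. Here's a toy, self-contained sketch of the idea (the simulate() below is a stand-in, not the game's physics; the real tool runs the physics at ~60x and uses save states to avoid replaying the run from the start):

    # Toy sketch of the bruteforce idea: random search over an input sequence,
    # keeping any perturbation that improves the finish time.

    import random

    def simulate(inputs):
        """Stand-in 'finish time' for a given input sequence (lower is better)."""
        target = [0.3, -0.1, 0.8, 0.0, -0.5]
        return sum((a - b) ** 2 for a, b in zip(inputs, target))

    def bruteforce(inputs, iterations=5000):
        best_inputs = list(inputs)
        best_time = simulate(best_inputs)
        for _ in range(iterations):
            candidate = list(best_inputs)
            i = random.randrange(len(candidate))
            candidate[i] += random.uniform(-0.1, 0.1)  # nudge one input
            new_time = simulate(candidate)
            if new_time < best_time:
                best_inputs, best_time = candidate, new_time
        return best_inputs, best_time

    if __name__ == "__main__":
        best, t = bruteforce([0.0] * 5)
        print(best, t)

The save states are what make this practical: instead of re-running the whole attempt for every candidate, you rewind to just before the change.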
msephton|1 year ago