There's no learning exactly. As the post explains, the fuzzer is aware of various RAM addresses (and has a tactic for how it "presses" buttons in the game). It's just trying to explore the space of Mario's level plus his x and y coordinates. (I'm an Antithesis employee.)
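To make "exploring the space" concrete, here is a rough sketch of the general idea on a made-up toy game. To be clear, this is not Antithesis code and not how the real fuzzer is implemented: the `step` function, the pit positions, the button set, and the resume-from-a-saved-state strategy are all assumptions invented for illustration. The only thing taken from the post is that novelty is judged on (level, x, y).

```python
import random

# Hypothetical toy "game"; every name and constant here is made up.
BUTTONS = ("left", "right", "jump", "none")
LEVEL_LENGTH = 20
PIT_XS = {5, 11, 16}  # standing on the ground at these x positions is fatal


def step(state, button):
    """Advance the toy game one frame. Returns the new state, or None on death."""
    level, x, y = state
    if button == "right":
        x += 1
    elif button == "left":
        x = max(0, x - 1)
    if button == "jump" and y == 0:
        y = 2              # a jump can only start from the ground
    elif y > 0:
        y -= 1             # gravity pulls Mario down one unit per frame
    if y == 0 and x in PIT_XS:
        return None        # fell into a pit
    if x >= LEVEL_LENGTH:
        return (level + 1, 0, 0)  # level cleared, start the next one
    return (level, x, y)


def explore(target_level=1, max_iterations=100_000, seed=0):
    """Random exploration that only remembers states with a new (level, x, y)."""
    rng = random.Random(seed)
    start = (0, 0, 0)
    seen = {start}       # every distinct (level, x, y) reached so far
    frontier = [start]   # saved states the fuzzer can resume from
    for iteration in range(max_iterations):
        state = rng.choice(frontier)           # resume from a saved state
        for _ in range(rng.randint(1, 8)):     # short burst of random inputs
            state = step(state, rng.choice(BUTTONS))
            if state is None:
                break                          # died; abandon this burst
            if state not in seen:
                seen.add(state)
                frontier.append(state)
            if state[0] >= target_level:
                return iteration, len(seen)
    return None, len(seen)


if __name__ == "__main__":
    iteration, n_seen = explore()
    print(f"reached target after {iteration} iterations, {n_seen} states seen")
```

Nothing is learned between bursts: the inputs stay random, and the only memory is the set of (level, x, y) points already visited plus the saved states to resume from. Keying on just those three values is what keeps the space small enough to cover in this sketch.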
nextaccountic|1 year ago
Or rather, if there were tons of irrelevant state, it could always end up trapped somewhere and never actually complete a level even after centuries of fuzzing.
Something similar was tested in the Twitch Plays Pokemon [0] gaming experiment, though there the inputs only appeared random: there were "factions" that tried either to sabotage the run or to make it progress. Ultimately the majority of the players cooperated to complete the game, and this was the deciding factor in the run's success. Maybe fuzzing can't complete Pokemon the way that TPP could (or reinforcement learning could).
[0] https://en.wikipedia.org/wiki/Twitch_Plays_Pok%C3%A9mon
vojev|1 year ago
I think Pokemon could be beaten with our techniques. Final Fantasy on NES poses problems similar to Pokemon's, and some progress has been made on that game here in the past.