I'd be completely flabbergasted if they trained an AI to parse source code while playing a game.
I think it's the "lack of addition" rather than the "expressly disallowed" training rules that are being referred to.
They wouldn't necessarily need to parse the source code. E.g. initially training a neural network on replays of human runs (AlphaGo style, as opposed to AlphaZero) would probably be considered "spoiling" in the Nethack community, but it's a plausible approach assuming you can somehow obtain a training dataset.
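A minimal sketch of that AlphaGo-style imitation step, using toy stand-in data (the replay format, observations, and action names here are all hypothetical, not real NetHack interfaces):

```python
from collections import Counter, defaultdict

# Hypothetical replay format: lists of (observation, action) pairs taken
# from recorded human games. Real NetHack observations would be full
# dungeon states; short strings stand in for them here.
human_replays = [
    [("low_hp", "pray"), ("monster_adjacent", "attack")],
    [("low_hp", "pray"), ("monster_adjacent", "write_elbereth")],
    [("monster_adjacent", "attack")],
]

# Behavioral cloning in its simplest form: for each observation, learn
# the empirical distribution of actions humans took in that situation.
policy = defaultdict(Counter)
for replay in human_replays:
    for obs, action in replay:
        policy[obs][action] += 1

def act(obs):
    # Greedy imitation: pick the action humans chose most often.
    return policy[obs].most_common(1)[0][0]

print(act("low_hp"))  # -> "pray"
```

A real system would use a neural network over raw game state instead of a lookup table, but the "spoiling" objection applies either way: the agent starts from human knowledge rather than discovering the game on its own.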
The AI doesn't need to parse the source code. I haven't played Nethack, but based on the example given of writing ELBERETH in the dirt, the person training the model could add that to the set of actions the model can choose at each step, and the model would learn what it does by playing millions of games.
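That idea can be sketched as a toy bandit: the trainer exposes "write Elbereth" as one more discrete action, and the agent discovers its value purely from reward, never from the source. The action names and reward numbers below are invented stand-ins, not real NetHack dynamics:

```python
import random

random.seed(0)

# Hypothetical discrete action set; "write_elbereth" is the extra
# action the trainer chose to expose. The agent knows nothing about
# what any of these actions do.
ACTIONS = ["attack", "flee", "write_elbereth"]

def step(action):
    # Toy environment: Elbereth scaring the monster away yields the
    # highest average reward (an assumption for illustration).
    base = {"attack": 0.3, "flee": 0.1, "write_elbereth": 0.8}
    return base[action] + random.gauss(0, 0.05)

# Epsilon-greedy value estimation: learn action values from play alone.
values = {a: 0.0 for a in ACTIONS}
counts = {a: 0 for a in ACTIONS}
for _ in range(5000):
    if random.random() < 0.1:
        a = random.choice(ACTIONS)  # explore
    else:
        a = max(values, key=values.get)  # exploit current estimate
    reward = step(a)
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]  # running mean

best = max(values, key=values.get)
print(best)
```

After a few thousand plays the estimates separate and the greedy choice settles on writing Elbereth, which is the point: no source access, just an action index and a reward signal.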
amalcon|3 years ago
cool_dude85|3 years ago
pmontra|3 years ago