Superhuman Stratego with RL and test-time search (arxiv.org)
2 points | algo_trader | 3 months ago | 1 comment

algo_trader | 3 months ago
Only 2000 GPU hours
Heavily customized network
95% win rate in recent human tournament sample
Several training techniques for evaluation/learning rate