item 38713196 | s-casci | 2 years ago

The policy function outputs a probability for every possible action, legal or illegal. Once you have a way of indexing those actions, the policy and the game must agree on it: a given index has to refer to the same action in both.
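To make the shared-indexing point concrete, here is a minimal sketch. It assumes a hypothetical 3x3 board game where an action is a (row, col) cell. Every name in it (`action_to_index`, `index_to_action`, `legal_mask`, `masked_policy`) is made up for illustration. Masking out illegal actions and renormalising is one common way to handle the illegal entries, not something the comment above prescribes.

```python
import math

BOARD_SIZE = 3  # hypothetical 3x3 board, chosen only for illustration
NUM_ACTIONS = BOARD_SIZE * BOARD_SIZE


def action_to_index(row, col):
    """Single source of truth for the action <-> index mapping (row-major)."""
    return row * BOARD_SIZE + col


def index_to_action(index):
    """Inverse of action_to_index: returns (row, col)."""
    return divmod(index, BOARD_SIZE)


def legal_mask(board):
    """Game side: True where the cell is empty, flattened in the same
    row-major order that action_to_index uses."""
    return [cell is None for row in board for cell in row]


def masked_policy(logits, mask):
    """Policy side: softmax over all NUM_ACTIONS logits, with illegal
    entries forced to probability 0 and the rest renormalised."""
    legal_logits = [l for l, ok in zip(logits, mask) if ok]
    if not legal_logits:
        raise ValueError("no legal actions")
    top = max(legal_logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


board = [
    ["X", None, None],
    [None, "O", None],
    [None, None, None],
]
mask = legal_mask(board)
probs = masked_policy([0.0] * NUM_ACTIONS, mask)  # stand-in for network output
best = max(range(NUM_ACTIONS), key=lambda i: probs[i])
print(index_to_action(best))  # decoded with the same mapping the game uses
```

The thing to notice is that `legal_mask` and `index_to_action` both rely on the same row-major convention as `action_to_index`. If the game flattened the board column-major while the policy assumed row-major, the mask would silently zero out the wrong moves.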