cweill | 3 years ago

I also have this question. Does the RL MDP actually encode cause and effect, or does it just learn (bidirectional) correlations between states and actions?

I wonder if Pearl thinks that RL replicates his do-calculus under the hood, or if that's an innovation we're missing.
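
To make the distinction concrete, here is a toy sketch and not a claim about how any particular RL algorithm works. Everything in it is an assumption made up for illustration: a hidden binary variable `U`, a logging policy that picks its action by peeking at `U`, and a reward that depends only on `U`. It compares the conditional P(R=1 | A=a) you would read off logged data with the interventional P(R=1 | do(A=a)) from Pearl's framework, computed exactly by enumeration.

```python
# Hypothetical one-step problem: U is a hidden confounder, A an action, R a reward.
# All numbers and function names below are illustrative assumptions.

P_U = {0: 0.5, 1: 0.5}  # prior over the hidden variable U


def p_a_given_u(a: int, u: int) -> float:
    """Logging policy: copies U with probability 0.9 (it can see U)."""
    return 0.9 if a == u else 0.1


def p_r1_given_u_a(u: int, a: int) -> float:
    """Reward is 1 exactly when U == 1; the action has no causal effect."""
    return 1.0 if u == 1 else 0.0


def observational(a: int) -> float:
    """P(R=1 | A=a): condition on the action seen in the logged data."""
    num = sum(P_U[u] * p_a_given_u(a, u) * p_r1_given_u_a(u, a) for u in P_U)
    den = sum(P_U[u] * p_a_given_u(a, u) for u in P_U)
    return num / den


def interventional(a: int) -> float:
    """P(R=1 | do(A=a)): set the action ourselves, so U keeps its prior."""
    return sum(P_U[u] * p_r1_given_u_a(u, a) for u in P_U)


if __name__ == "__main__":
    for a in (0, 1):
        print(f"a={a}  P(R=1|A=a)={observational(a):.2f}  "
              f"P(R=1|do(A=a))={interventional(a):.2f}")
```

In this toy setup the logged data make action 1 look much better than action 0, while forcing either action gives the same expected reward. So the two quantities can come apart whenever the logged actions depend on something the learner does not observe; whether a given RL setup estimates one or the other depends on how its data were collected.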
