The approach I used was similar. The idea of maximising observed control of the world means you seek states where you can reach many other states, but _predictably_ so. This comes 'for free' when using Information Theory to model a channel.
Do you have any reading you'd recommend related to this?
I naively thought it would be some kind of Kalman filtering of sorts but from what I gather in your words it doesn't even have to be "that" complicated, right?
What's the tradeoff between "delete all state in the world with 100% certainty" and "be able to choose any next state of the world with (100-epsilon)% certainty"?
In Information Theory, there is a concept of Channel Capacity. If a channel is defined as the probability of the output being s if you send a, across all possible values of a, then the Channel Capacity is the maximum amount of information you can communicate across this channel, measured in bits.
To achieve the Channel Capacity you need to find the optimum distribution across a - i.e. what set of signals maximises the information you can transmit on this channel. There are known algorithms for finding this distribution (e.g. Blahut-Arimoto).
Now if you model the world as a channel, where s represents the reachable states and a represents the actions the agent can take (and the channel, P(s|a), represents the dynamics of the world), you can calculate what actions allow you maximal control (in terms of states you can controllably reach).
cmehdy|4 years ago
I naively thought it would be some kind of Kalman filtering of sorts but from what I gather in your words it doesn't even have to be "that" complicated, right?
edit: found your link to the paper in another post ( https://news.ycombinator.com/item?id=27749619 ), thanks!
benlivengood|4 years ago
TomAnthony|4 years ago
To achieve the Channel Capacity you need to find the optimum distribution across a - i.e. what set of signals maximises the information you can transmit on this channel. There are known algorithms for finding this distribution (e.g. Blahut-Arimoto).
Now if you model the world as a channel, where s represents the reachable states and a represents the actions the agent can take (and the channel, P(s|a), represents the dynamics of the world), you can calculate what actions allow you maximal control (in terms of states you can controllably reach).
More info in this paper: https://uhra.herts.ac.uk/handle/2299/15376