
nee1r | 4 days ago

Yeah! I love the BCO paper; I think it's extremely intuitive, and these methods are really interesting at a time when data without labels is abundant. I especially like the idea of iteratively making the inverse dynamics better — might lean closer to that in the future.
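The iterative loop being referred to can be sketched on a toy problem: alternate between (1) fitting an inverse-dynamics model on the agent's own experience, (2) using it to infer the actions behind state-only demonstrations, and (3) behavioral cloning on those inferred labels. This is only an illustrative caricature of BCO's structure, with a made-up 1-D environment and linear models, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "environment": next_state = state + action, so the true
# inverse dynamics is simply a = s' - s. All names here are illustrative.
def step(s, a):
    return s + a

# State-only expert demonstrations: the expert moves halfway toward 0,
# i.e. its (hidden) policy is a = -0.5 * s.
demo_states = [5.0]
for _ in range(10):
    s = demo_states[-1]
    demo_states.append(step(s, -0.5 * s))

w_inv = 0.0  # inverse-dynamics parameter: a_hat = w_inv * (s' - s)
w_pol = 0.0  # policy parameter:           a     = w_pol * s

for it in range(5):
    # 1) collect (s, a, s') transitions with the current noisy policy
    S, A, Sn = [], [], []
    s = rng.normal(0, 3)
    for _ in range(50):
        a = w_pol * s + rng.normal(0, 1)  # exploration noise
        sn = step(s, a)
        S.append(s); A.append(a); Sn.append(sn)
        s = sn
    S, A, Sn = map(np.array, (S, A, Sn))

    # 2) refit the inverse-dynamics model by least squares on delta = s' - s
    d = Sn - S
    w_inv = (d @ A) / (d @ d)

    # 3) use it to label the state-only demonstrations with inferred actions
    ds = np.diff(np.array(demo_states))
    a_hat = w_inv * ds

    # 4) behavioral cloning: fit a = w_pol * s on the labeled demos
    s_demo = np.array(demo_states[:-1])
    w_pol = (s_demo @ a_hat) / (s_demo @ s_demo)

# w_pol should now approximate the expert's hidden policy a = -0.5 * s.
```

In this toy case the inverse dynamics is exactly recoverable (a = s' - s), so one round already nails it; the interesting part in practice is that each round's better policy generates better-covered transitions, which in turn improve the inverse-dynamics fit.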


cs702 | 3 days ago

> i especially like the idea of iteratively making the inverse dynamics better

Same here.

The notion of inducing these models to "hypothesize" distributions over possible actions given subsequent observed transitions makes me think of "contrastive divergence," the method Hinton and others came up with for unsupervised training of Restricted Boltzmann Machines (RBMs), in the prehistoric era of deep learning.

Given each training sample, an RBM would 1) execute a forward pass, 2) sample its output units, 3) "hypothesize" its input units, 4) execute another forward pass on the "hypothesized" input units to sample new output units, and 5) compute a type of contrastive error for local updates. RBMs could be stacked, with output units from one becoming input units for the next one. Hinton called the input units "visible," and the output ones "hidden."
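The five steps above can be sketched as one CD-1 update for a binary RBM. The toy sizes, variable names, and learning rate are mine, chosen for illustration; this is a minimal sketch of the contrastive-divergence idea, not Hinton's code:

```python
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 4
W = rng.normal(0, 0.1, size=(n_visible, n_hidden))  # weights
b = np.zeros(n_visible)  # visible ("input") biases
c = np.zeros(n_hidden)   # hidden ("output") biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0):
    """One CD-1 step on a single binary visible vector v0."""
    # 1) forward pass: hidden probabilities given the data
    ph0 = sigmoid(v0 @ W + c)
    # 2) sample the hidden units
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # 3) "hypothesize" the visible units from the sampled hiddens
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    # 4) second forward pass, on the hypothesized visibles
    ph1 = sigmoid(v1 @ W + c)
    # 5) contrastive gradient: <v h>_data minus <v h>_reconstruction,
    #    a purely local update (no end-to-end backpropagation)
    grad_W = np.outer(v0, ph0) - np.outer(v1, ph1)
    return grad_W, v0 - v1, ph0 - ph1

v0 = rng.integers(0, 2, size=n_visible).astype(float)
gW, gb, gc = cd1_update(v0)
lr = 0.1
W += lr * gW
b += lr * gb
c += lr * gc
```

Stacking would repeat this layer by layer: after training one RBM, its hidden activations become the "visible" data for the next.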

It's not the same, obviously, but the idea of modeling machine-generated inputs (or actions) given outputs (or transitions) has always been appealing. It has a long history.