manux | 4 years ago
- RL says, give me a reward and I'll give you its max.
- GFlowNet says, give me a reward and I'll give you all its modes (via p(x) \propto R(x)).
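A toy way to see the contrast between the two bullets, as a rough sketch only: the reward table below is made up, and the "GFlowNet" side samples the target distribution p(x) \propto R(x) directly rather than learning a sequential sampling policy the way an actual GFlowNet would.

```python
import numpy as np

# Hypothetical toy reward over 6 discrete objects, with two modes (indices 1 and 4).
rewards = np.array([0.1, 5.0, 0.2, 0.1, 4.0, 0.1])

def rl_style_answer(r):
    """Reward maximization: return only the single best object."""
    return int(np.argmax(r))

def gflownet_style_target(r):
    """The distribution a GFlowNet is trained to sample from: p(x) proportional to R(x)."""
    return r / r.sum()

def sample_from_target(r, n, seed=0):
    """Draw n objects directly from the target (stand-in for a trained sampler)."""
    rng = np.random.default_rng(seed)
    return rng.choice(len(r), size=n, p=gflownet_style_target(r))

best = rl_style_answer(rewards)
p = gflownet_style_target(rewards)
samples = sample_from_target(rewards, 10_000)
counts = np.bincount(samples, minlength=len(rewards))
```

Here `best` only ever reports the top mode, while `counts` puts most of its mass on both high-reward objects, roughly in proportion to their rewards.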
Yes, ideally you would have a loss (well, a reward/energy) that is invariant and operates directly on, e.g., the molecule itself rather than on some arbitrary ordering of its nodes.
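To make "invariant" concrete, here is a small sketch, assuming the molecule is represented as an adjacency matrix. Both reward functions are hypothetical illustrations, not anything from a real GFlowNet codebase: one depends only on the multiset of node degrees, the other on whichever node happens to be listed first.

```python
import numpy as np

def invariant_reward(adj):
    """Hypothetical reward that depends only on the sorted degree sequence,
    so relabelling the nodes cannot change it."""
    degrees = np.sort(adj.sum(axis=1))
    return float(np.exp(-np.var(degrees)))

def order_dependent_reward(adj):
    """Hypothetical reward that looks at the first node only,
    so it changes with the arbitrary node ordering."""
    return float(adj[0].sum())

# Path graph 0-1-2-3 as a stand-in for a tiny molecule.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Same graph with nodes 0 and 1 swapped in the ordering.
perm = [1, 0, 2, 3]
permuted = adj[np.ix_(perm, perm)]

same = np.isclose(invariant_reward(adj), invariant_reward(permuted))
differs = order_dependent_reward(adj) != order_dependent_reward(permuted)
```

With the invariant version, two orderings of the same molecule get the same reward, so the sampler is not pushed toward one arbitrary labelling over another.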