we were thinking about doing exactly this. The closest current work is probably the amazing "Learning Formal Mathematics from Intrinsic Motivation" by Poesia et al. (they use constraints to increase the likelihood of generating correct theorems/proofs during RL): https://arxiv.org/abs/2407.00695
informal007|9 months ago
imtringued|9 months ago