ImJasonH | 5 months ago

Is anybody working on making it easier and cheaper to build specialized things?

-_-|5 months ago

Yes! At https://RunRL.com we offer hosted RL fine-tuning, so all you need to provide is a dataset and reward function or environment.
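
To give a rough idea of what a "reward function" can look like in this kind of setup, here is a minimal sketch in plain Python. The function name, signature, and scoring rule are all hypothetical and chosen for illustration; this is not RunRL's actual API. It assumes each dataset row has a prompt and an expected numeric answer, and it scores a model completion by checking the last number it contains.

```python
import re


def reward(prompt: str, completion: str, expected: str) -> float:
    """Hypothetical reward: 1.0 if the last number in the completion
    matches the expected answer exactly, otherwise 0.0."""
    # Collect every integer or decimal number mentioned in the completion.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    # Treat the final number as the model's answer.
    return 1.0 if numbers[-1] == expected.strip() else 0.0
```

A binary reward like this is easy to reason about; a real one would likely give partial credit or check formatting as well.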

selim-now|5 months ago

Yes! Check out https://distillabs.ai/, which follows a similar approach, except that the evaluation set is held out before the synthetic data generation, which I would argue makes it more robust (I'm affiliated).
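
As a rough sketch of the "hold out the evaluation set before generating synthetic data" idea, assuming a small list of seed examples and some generator function: the names `split_then_generate` and `toy_generate` are made up for illustration and do not describe distil labs' actual pipeline. The point is only the ordering: split first, then generate from the training seeds alone, so nothing derived from the evaluation examples ends up in training.

```python
import random


def split_then_generate(seed_examples, generate, eval_fraction=0.2, seed=0):
    """Hold out an evaluation set first, then synthesize training data
    only from the remaining seed examples."""
    rng = random.Random(seed)
    shuffled = list(seed_examples)
    rng.shuffle(shuffled)

    # Carve off the evaluation set before any generation happens.
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    eval_set = shuffled[:n_eval]
    train_seeds = shuffled[n_eval:]

    # The generator never sees eval_set, so synthetic data cannot leak from it.
    synthetic = [new for ex in train_seeds for new in generate(ex)]
    return eval_set, train_seeds + synthetic


def toy_generate(example):
    # Placeholder for a real synthetic-data generator (e.g. an LLM call).
    return [example + " (paraphrase)"]
```

Usage would look like `eval_set, train_set = split_then_generate(examples, toy_generate)`; swapping the order (generate first, split afterwards) is what risks near-duplicates of evaluation items landing in the training data.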