item 45429561 (no title)
ImJasonH | 5 months ago Is anybody working on making building specialized things easier and cheaper?
-_-|5 months ago Yes! At https://RunRL.com we offer hosted RL fine-tuning, so all you need to provide is a dataset and either a reward function or an environment.
selim-now|5 months ago Yes! Check out https://distillabs.ai/. It follows a similar approach, except that the evaluation set is held out before the synthetic data generation, which I would argue makes it more robust. (I'm affiliated.)