Collecting data is hard, but the library also does synthetic data generation. For example, you can create the data for DPO fully synthetically. Check out the self-rewarding LLMs example:
https://datadreamer.dev/docs/latest/pages/get_started/quick_...