patelajay285 | 2 years ago
However, there is a lot of documentation on the site to help guide users. This documentation page shows that you can also load data from local sources: JSON, CSV, or text files, a local HF Dataset folder, or even a Python `dict` or `list`:
https://datadreamer.dev/docs/latest/datadreamer.steps.html#t...
We'll definitely keep improving documentation, guides, and examples. We have a lot of it already, and more to come! This has only recently become a public project :)
If anyone has any questions on using it, feel free to email me directly (email on the site and HN bio) for help in the meantime.
mk_stjames | 2 years ago
I would not have guessed that the base input data processing would be filed under 'steps'. Now I kinda see how you are working, though I admit I'm not the target audience.
If you want this to really take off for people outside of a very, very specific class of researchers... set up an example on your landing page that loads a local JSON file of user prompts/answers/rejects and finetunes a llama model, using your datadreamer.steps.JSONDataSource as the loader. Or a txt file with the system/user/assistant prompts tagged and examples given. Yes, the 'lines of code' count for your front-page example may grow a bit!
Maybe there are a lot of 'ML researchers' who are used to the super-abstract OOP API, load-it-from-Hugging-Face style you are targeting, but know that there are also a ton who aren't.
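For concreteness, here is a minimal sketch of the kind of local file such a landing-page example might start from. It uses only the Python standard library and does not call DataDreamer at all. The field names `prompt`, `chosen`, and `rejected` and the helper functions are hypothetical, chosen for illustration; the thread does not specify a schema.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical field names; the thread does not specify a schema.
REQUIRED_FIELDS = ("prompt", "chosen", "rejected")


def load_preference_records(path):
    """Read a JSON array of records, keeping only the required string fields."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    if not isinstance(raw, list):
        raise ValueError("expected a top-level JSON array of records")
    records = []
    for i, row in enumerate(raw):
        if not isinstance(row, dict):
            raise ValueError(f"record {i} is not a JSON object")
        missing = [k for k in REQUIRED_FIELDS if not isinstance(row.get(k), str)]
        if missing:
            raise ValueError(f"record {i} is missing string field(s): {missing}")
        records.append({k: row[k] for k in REQUIRED_FIELDS})
    return records


def to_columns(records):
    """Pivot a list of row records into a dict of column lists."""
    return {k: [r[k] for r in records] for k in REQUIRED_FIELDS}


if __name__ == "__main__":
    sample = [
        {"prompt": "Say hi.", "chosen": "Hello!", "rejected": "No."},
        {"prompt": "Add 2 and 2.", "chosen": "4", "rejected": "5"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "preferences.json"
        path.write_text(json.dumps(sample), encoding="utf-8")
        columns = to_columns(load_preference_records(path))
        print(columns["prompt"])
```

The result is a plain Python `dict` of lists, which is one of the input types patelajay285 says can be loaded; the exact step and call for that are in the linked documentation and are not shown here.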
patelajay285 | 2 years ago