

citboin | 1 year ago

>The main thing you can do is support companies and groups who are releasing open source models. They are usually using their own data.

Alternatively, we could build standardized open source training datasets from sources like Wikipedia and other Wikimedia projects, public domain literature, and open courseware. I'm sure there are many other free and legal sources of data.


KaiserPro|1 year ago

But the training data is one of the key things that makes or breaks your model's performance.

There is a reason why datasets are private and the model weights aren't.