item 41546377
jcmeyrignac | 1 year ago
Any plans for languages other than English? This would be a perfect tool for French.
deepsquirrelnet|1 year ago
It’s certainly feasible. I’d need to put together a corpus for training, and I’m not terribly familiar with what’s available for French.
I have done some training with the Mistral family of models, and that’s probably what I’d try first on a French corpus.
Feel free to open an issue and I’ll work on it as I find time.
Ey7NFZ3P0nzAe|1 year ago
Very interested in a multilingual version too!
FYI, Hugging Face hosts datasets too. And Wikipedia has a nice portal for datasets: https://en.m.wikipedia.org/wiki/List_of_datasets_for_machine...