wdpk | 2 years ago
Even if true, which does not seem to be the case, the whole thing sounds pretty marginal: to train a model that is most likely significantly larger than 100B parameters, one also needs orders of magnitude more training data than the ~120K chats that were shared on the ShareGPT website.
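A rough back-of-envelope check of that gap. All figures below are illustrative assumptions (a Chinchilla-style ~20 tokens per parameter for the pretraining budget, and ~1K tokens per shared chat), not reported numbers:

```python
# Back-of-envelope sketch of the "orders of magnitude" claim above.
# Every constant here is an assumption for illustration, not a published figure.

PARAMS = 100e9                    # assumed model size: 100B parameters
TOKENS_PER_PARAM = 20             # Chinchilla-style rule of thumb (~20 tokens/param)
pretrain_tokens = PARAMS * TOKENS_PER_PARAM      # ~2e12 tokens

CONVERSATIONS = 120_000           # ShareGPT chats cited in the thread
TOKENS_PER_CONVERSATION = 1_000   # rough guess at average chat length
sharegpt_tokens = CONVERSATIONS * TOKENS_PER_CONVERSATION  # ~1.2e8 tokens

ratio = pretrain_tokens / sharegpt_tokens
print(f"pretraining budget : {pretrain_tokens:.1e} tokens")
print(f"ShareGPT corpus    : {sharegpt_tokens:.1e} tokens")
print(f"gap                : ~{ratio:.0f}x (about 4 orders of magnitude)")
```

Under those assumptions the ShareGPT corpus is roughly four orders of magnitude short of a pretraining budget, though it could still matter as fine-tuning data, which is the point debated below.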
halfeatenscone | 2 years ago
"The cat is finally out of the bag – Google relied heavily on @ShareGPT 's data when training Bard.
This was also why we took down ShareGPT's Explore page – which has over 112K shared conversations – last week.
Insanity."
wdpk | 2 years ago

Fine-tuning is not exactly the same as "relying heavily." I bet they got far more fine-tuning data from simply asking their ~100k employees to pre-beta test for a couple of months.