Everybody who has released a decent model has used a shadow library dump for training. Are they going to admit that they illegally downloaded more than 20 million books?
I heard a really interesting take somewhere recently (I believe it was on The Verge's Decoder podcast, which is excellent): a lot of the people in these organizations think this issue is a money problem and not a "fuck you fuck off forever keep your fucking hands off my data" problem, which could actually derail the entire GenAI industry as they run into more legal hurdles.
belter|1 year ago
__loam|1 year ago