top | item 47127735

(no title)

Aedelon | 7 days ago

Yeah, that HF dataset page is rough. 247+ threads, mostly DMCA reports, archive-locked fics scraped without consent, dataset reuploaded after takedown. The AO3 community had every reason to be furious.

Not RWKV-specific though. Most large corpora have the same sources in them, they just don't list them explicitly. Whether the transparency makes it better or worse is a real question.

discuss

order

No comments yet.