

stephenroller | 5 years ago

The dataset can be obtained from public sources around the web. It's mostly CommonCrawl, Reddit, Toronto Book Corpus, and Wikipedia.

You can find a very comparable corpus, open-sourced and easy to use, in the [T5 repo](https://github.com/google-research/text-to-text-transfer-tra...).

