orpheansodality | 2 years ago
Part of its contents comes from the "USPTO Backgrounds" dataset. From The Pile's paper:
> USPTO Backgrounds is a dataset of background sections from patents granted by the United States Patent and Trademark Office, derived from its published bulk archives. A typical patent background lays out the general context of the invention, gives an overview of the technical field, and sets up the framing of the problem space. We included USPTO Backgrounds because it contains a large volume of technical writing on applied subjects, aimed at a non-technical audience.
More details in the paper: https://arxiv.org/pdf/2101.00027.pdf
The Pile: https://pile.eleuther.ai/