cottonseed | 3 years ago

Yes. EleutherAI is doing it, and they are probably one of many:

https://www.eleuther.ai/projects/gpt-neox/
https://github.com/EleutherAI/gpt-neox
https://arxiv.org/abs/2204.06745

They have a 20B-parameter model. I think the primary dataset for these open models is The Pile: https://arxiv.org/abs/2101.00027 (web scrape, PubMed, arXiv, GitHub, Wikipedia, etc.; there is a nice diagram on page 2 that summarizes the contents).

sharemywin | 3 years ago

From what I gather, The Pile is only a first step. It would require more steps: task-oriented chats, as well as building something that can rate answers.