It's missing a lot of crucial details. Nothing on the dataset used, nothing on the data mix, nothing on their data cleaning procedures, nothing on how many tokens it was trained on.
BERT was on arXiv before being peer reviewed. As were T5, BART, LLaMA, OPT and GPT-NeoX-20B. The Pile and FLAN were also on arXiv before being peer reviewed. Of course, the original Transformer paper was also on arXiv before being peer reviewed.
dazed_confused|2 years ago
arugulum|2 years ago
Being on arXiv before being peer reviewed is not the problem, or even a problem.
jmac01|2 years ago