
01-AI/Yi: A series of large language models trained from scratch

143 points | simonpure | 2 years ago | github.com

52 comments

popinman322 | 2 years ago
When I saw "from scratch" instead of "pretrained" I assumed this was trained using some novel RL setup. There's value in using the same verbiage as everyone else.
zapnuk | 2 years ago
These pedantic “when I saw X instead of Y I assumed Z” are the most annoying comments on this website.

“From scratch” might be non-specific, but it is fine phrasing in this case.

GaggiX | 2 years ago
"Trained from scratch" is perfectly fine terminology to report that the model they published is not a finetune, but was trained from randomly initialized weights.
Mougatine | 2 years ago
"From scratch" is commonly used in the field.
synarchefriend | 2 years ago
They probably want to emphasize that it's not another llama derivative.
seydor | 2 years ago
I would have liked to see how it compares with Mistral
logicchains | 2 years ago
China's catching up fast in the open source model space. I wonder how long it'll take until they have a commercial model competitive with ChatGPT 3.5 or Claude 2.
antupis | 2 years ago
Who cares, as long as it is an open source model and the weights are available? The nightmare scenario is that AI is behind some paywall and some entity can decide what goes in and what comes out.
zone411 | 2 years ago
They do already. Ernie 4.0.
ilaksh | 2 years ago
Looks amazing. Too bad they don't allow commercial use of the model (without a license agreement).
gs17 | 2 years ago
Non-commercial use is also limited.

> Your use of the Yi Series Models must comply with the Laws and Regulations as well as applicable legal requirements of other countries/regions, and respect social ethics and moral standards

("Laws and Regulations" is specifically mainland China's)

dvh | 2 years ago
I downloaded the repository and it is 700kB (1900 LOC). It clearly doesn't contain what it claims to have. How is this considered "open source"?
knapcio | 2 years ago
Please read the project description (point 2, "Download the model").
brucethemoose2 | 2 years ago
I'd posit you didn't install git-lfs.

You probably want to download a quant instead anyway, even if you are on a very fast PC.

anon23432343 | 2 years ago
Another hour, another new AI model.

Which can't solve bubble sort correctly and will output a badly performing version of it.

AI is the future.

esafak | 2 years ago
It's a language model. You don't expect your car to fly.
postalrat | 2 years ago
I'm a programmer and I couldn't tell you how I could "solve" a bubble sort.
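
For context on the bubble sort exchange above, here is a minimal Python sketch of the algorithm. It is not taken from the thread or from any model's output; the function name `bubble_sort` and the `swapped` early-exit flag are illustrative choices. The early exit is the usual way to avoid the "bad performing" variant that keeps making passes over already sorted input.

```python
def bubble_sort(items):
    """Return a new sorted list using bubble sort with an early exit."""
    result = list(items)  # copy so the caller's list is left untouched
    n = len(result)
    # After each pass the largest remaining element has bubbled to index `end`.
    for end in range(n - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                # Swap adjacent elements that are out of order.
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:
            # No swaps in a full pass means the list is already sorted.
            break
    return result
```

Calling `bubble_sort([5, 2, 9, 1])` returns `[1, 2, 5, 9]`. The worst case is still quadratic, so this is only useful as an illustration, not as a replacement for the built-in `sorted`.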