They have more fully open stuff in the pipeline. IMHO it's good that they put out stuff for hobbyists to play around with so that they're not immediately overtaken by people ready to deploy things at commercial scale.
> Hardware: StableLM Zephyr 3B was trained on the Stability AI cluster across 8 nodes, each with 8 A100 80GB GPUs.
I might be missing it, but do they say how many training tokens were used to train this?
That would help efforts like TinyLlama figure out how scaling works with training tokens vs. parameter count, and challenge the Chinchilla model.
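For context, the commonly cited Chinchilla rule of thumb is roughly 20 training tokens per parameter for compute-optimal training (a sketch of the approximation, not an exact law; models like TinyLlama deliberately train far past this ratio):

```python
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Rough Chinchilla-style compute-optimal token count: ~20 tokens per parameter."""
    return n_params * tokens_per_param

# For a 3B-parameter model like StableLM Zephyr 3B:
print(f"{chinchilla_optimal_tokens(3e9):.1e} tokens")  # 6.0e+10 tokens, i.e. ~60B
```

Training well beyond that 60B-token budget is exactly the "tokens vs. parameters" trade-off the comment is asking about.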
How do those licenses work for generated content?
If it's non-commercial, does that mean I can still use it at work to generate things?
In other words, is it similar to, e.g., using GIMP, which is open source, but where I can still use the created content in a commercial product without attribution?
Yeah, I think this is a great release, but I also suspect that most people won't end up using it just because of the license. It's actually a lot more restrictive than what I would personally consider "commercial" usage:
> Non-Commercial Uses does not include any production use of the Software Products or any Derivative Works.
So even if you want to launch a free service using this, that's not allowed.
Not really. They already chose to show the benchmark where it does best, and even there it's still quite a bit worse (though definitely impressive for its size).
If you look at other benchmarks, for example MMLU (5-shot), this model scores 46.3 while GPT-3.5 scores 70.
But there might be some use cases where this one is close enough in performance and the difference in cost and speed make it a better choice.
Benchmarks are either limited, affected by data leakage, or in most cases just don't reflect real usability, so I've personally stopped looking at them to compare models. If I want to try a new model I hear a lot of chatter about, I use it for a few hours in my daily workflow. My baseline is GPT-3.5 and GPT-4, and I compare new models against them in terms of day-to-day usage.
The LLM field is still messy at large: if you look at rankings of model performance, they still don't reflect real-life usability. I think one major challenge is finding a benchmark that does.
simonw|2 years ago
I'm very interested in high quality 3B models, but it's hard to get excited about this given the increasing array of commercially usable models.
anigbrowl|2 years ago
emadm|2 years ago
brianjking|2 years ago
supermatt|2 years ago
“Zephyr” is MIT, but “Stability Zephyr” is non-commercial. They could have at least used a different name.
“Inspired” in all but license it would seem
filterfiber|2 years ago
emadm|2 years ago
https://stability.wandb.io/stability-llm/stable-lm/reports/S...
mirekrusin|2 years ago
josh-sematic|2 years ago
Reubend|2 years ago
stavros|2 years ago
Version467|2 years ago
filterfiber|2 years ago
Zephyr-7B-β still beats it on most benchmarks, but it's close.
This model gets almost Zephyr-7B-β performance at 3B parameters, which is a lot better for inference requirements.
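To illustrate why the smaller model is lighter to serve, here's a back-of-envelope estimate of weight memory alone (ignoring KV cache and activation overhead, and using 2 bytes/param for fp16 and ~0.5 bytes/param for 4-bit quantization as rough assumptions):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory footprint (GB) of the model weights alone."""
    return n_params * bytes_per_param / 1e9

for n_params, name in [(3e9, "3B"), (7e9, "7B")]:
    fp16 = weight_memory_gb(n_params, 2.0)   # fp16/bf16
    q4 = weight_memory_gb(n_params, 0.5)     # ~4-bit quantized
    print(f"{name}: ~{fp16:.0f} GB fp16, ~{q4:.1f} GB 4-bit")
```

Roughly 6 GB vs. 14 GB at fp16, which is the difference between fitting on a consumer GPU (or laptop) and not.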
alsodumb|2 years ago
3abiton|2 years ago
adamkochanowicz|2 years ago
haltist|2 years ago
nextworddev|2 years ago
simlevesque|2 years ago
pulse7|2 years ago
m3kw9|2 years ago
simonw|2 years ago
https://huggingface.co/TheBloke?search_models=Zephyr doesn't have a GGML for it yet, but I wouldn't be surprised to see one by the end of the day.