top | item 38465815


Hedepig | 2 years ago

I am currently tinkering with all of this. You can download a 3B-parameter model and run it on your phone. Of course it isn't that great, but I ran a 3B-param model[1] on my potato computer (a mid-range Ryzen CPU with onboard graphics) that does surprisingly well on benchmarks, and my experience with it has been pretty good.

Of course, more interesting things happen when you get to the 32B and 70B param models, which require high-end GPUs like the 3090.

[1] https://huggingface.co/TheBloke/rocket-3B-GGUF
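For anyone who wants to try it, a rough sketch of running a quantized GGUF from that repo with llama.cpp. The exact filename and flags are illustrative (the Q4_K_M quant and the `./main` binary name are assumptions; check the repo's file list and llama.cpp's README for the current CLI):

```shell
# Fetch a quantized GGUF of the model (Q4_K_M is a common
# quality/size tradeoff; filename assumed from TheBloke's naming scheme).
wget https://huggingface.co/TheBloke/rocket-3B-GGUF/resolve/main/rocket-3b.Q4_K_M.gguf

# Build llama.cpp from source and run the weights on CPU.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make

# -m: model path, -p: prompt, -n: max tokens to generate
./main -m ../rocket-3b.Q4_K_M.gguf -p "Hello" -n 128
```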

jart | 2 years ago

That's a nice model that fits comfortably on Raspberry Pi. It's also only a few days old! I've just finished cherry-picking the StableLM support from the llama.cpp project upstream that you'll need in order to run these weights using llamafile. Enjoy! https://github.com/Mozilla-Ocho/llamafile/commit/865462fc465...
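For reference, running external GGUF weights with a llamafile binary looks roughly like this (a sketch, assuming a build that includes the cherry-picked StableLM support; the weights filename is illustrative):

```shell
# Make the downloaded llamafile executable, then point it at the
# GGUF weights with -m (llamafile reuses llama.cpp's CLI flags).
chmod +x llamafile
./llamafile -m rocket-3b.Q4_K_M.gguf -p "Hello" -n 64
```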

Hedepig | 2 years ago

Thank you for this :)