tveita|2 years ago
Are there any cool highlights you can share about gemma.cpp? Does it have any technical advantages over llama.cpp? It looks like it introduces its own quantization format; is there a speed or accuracy gain over llama.cpp's 8-bit quantization?
austinvhuang|2 years ago
Besides the Python implementations, we also implemented a standalone C++ implementation that runs locally with just CPU SIMD: https://github.com/google/gemma.cpp