top | item 35611506


ankraft | 2 years ago

That sounds interesting. How can I obtain this Vicuna model that works with llama.cpp?


kgeist | 2 years ago

https://medium.com/@martin-thissen/vicuna-on-your-cpu-gpu-be...

See the section "CPU Installation (GGML Quantised)"

You need Python to download the model from HuggingFace using the official API. After that, all you need is the binary weights file and a compiled llama.cpp binary.

P.S. The author seems to have renamed their repo to "eachadea/legacy-vicuna-13b" on HuggingFace
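A minimal sketch of the download step using the official `huggingface_hub` Python package. The repo id comes from the thread above; the weights filename is a hypothetical placeholder, so check the model page on HuggingFace for the actual file names before running.

```python
# Sketch: fetch a GGML-quantised Vicuna checkpoint from the Hugging Face Hub.
# Requires: pip install huggingface_hub
REPO_ID = "eachadea/legacy-vicuna-13b"   # renamed repo mentioned in the thread
FILENAME = "ggml-vicuna-13b-4bit.bin"    # hypothetical; check the repo's file list

def hub_url(repo_id: str, filename: str) -> str:
    """Direct-download URL the Hub serves for a file on the main branch."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"

if __name__ == "__main__":
    from huggingface_hub import hf_hub_download

    # Downloads into the local Hugging Face cache and returns the path,
    # which you can then pass to llama.cpp via its -m flag.
    path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
    print(path)
```

Once downloaded, the printed path is what you hand to the compiled llama.cpp binary, e.g. `./main -m <path> -p "your prompt"`.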