llama.cpp is an inference engine, and its author also designed the GGUF file format. Functionary is a model that does function calling. You can download the Functionary weights in GGUF format and run them with llama.cpp on low-end machines, using the CPU, the GPU, or a mix of both.
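To make the CPU/GPU split concrete, here is a rough sketch that only builds the command line for a llama.cpp run, without executing it. Assumptions on my part: the binary is called `llama-cli` (older builds named it `main`), the file name `functionary.gguf` is a placeholder for whatever GGUF file you downloaded, and the helper `build_llama_cli_command` is hypothetical. The `-m`, `-ngl` and `-p` flags are llama.cpp's model, GPU-layer-count and prompt options. This does not set up Functionary's function-calling prompt format; it just shows where the offload knob is.

```python
import shlex


def build_llama_cli_command(model_path, prompt, n_gpu_layers=0, binary="llama-cli"):
    """Return the argv list for a llama.cpp CLI run (hypothetical helper).

    n_gpu_layers=0 keeps every layer on the CPU; a larger value asks
    llama.cpp to offload that many layers to the GPU (a mix of both).
    """
    return [binary, "-m", model_path, "-ngl", str(n_gpu_layers), "-p", prompt]


# Placeholder model file name; offload 20 layers to the GPU, rest stays on CPU.
cmd = build_llama_cli_command("functionary.gguf", "Hello", n_gpu_layers=20)
print(shlex.join(cmd))
```

You could pass the resulting list to `subprocess.run` once the binary and model file actually exist on your machine.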
thund|2 years ago
behnamoh|2 years ago