car | 9 days ago
Since I don't see it mentioned here: LlamaBarn is an awesome little (but mighty) macOS menu bar app that makes llama.cpp's great web UI and a tastefully curated set of downloadable models as easy as pie. It automatically determines which model and context sizes fit, based on available RAM.
https://github.com/ggml-org/LlamaBarn
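As a rough illustration of the RAM-based sizing idea (this is not LlamaBarn's actual heuristic; the bytes-per-parameter and headroom figures below are my own assumptions):

```python
# Back-of-envelope estimate of the largest quantized model that fits in RAM.
# NOT LlamaBarn's real logic -- just a sketch of the kind of calculation involved.

def max_params_b(ram_gb, bytes_per_param=0.55, headroom_gb=4.0):
    """Estimate max model size in billions of parameters.

    bytes_per_param ~0.55 roughly matches a ~4-bit quantization with
    overhead; headroom_gb is reserved for the OS and the KV cache.
    Both figures are assumptions, not LlamaBarn's constants.
    """
    usable = max(ram_gb - headroom_gb, 0.0)
    return usable / bytes_per_param

print(f"{max_params_b(16):.0f}B")  # prints 22B -- a 16 GB Mac fits roughly a 20B 4-bit model
```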
Downloaded models live in:
~/.llamabarn
Apart from the default of serving on localhost only, the bind address can be configured via `defaults` on the command line:

    # bind to all interfaces (0.0.0.0)
    defaults write app.llamabarn.LlamaBarn exposeToNetwork -bool YES
    # or bind to a specific IP (e.g., for Tailscale)
    defaults write app.llamabarn.LlamaBarn exposeToNetwork -string "100.x.x.x"
    # disable (default)
    defaults delete app.llamabarn.LlamaBarn exposeToNetwork
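Once exposed, the llama.cpp server behind LlamaBarn speaks an OpenAI-compatible HTTP API, so any standard client works against it. A minimal sketch (the host, port, and the `build_chat_request` helper are my own assumptions for illustration; check LlamaBarn for the actual address):

```python
# Hypothetical minimal client for the server's OpenAI-compatible
# /v1/chat/completions endpoint. Host and port are assumptions.
import json
import urllib.request

def build_chat_request(host, prompt, model="default"):
    """Construct (url, payload) for a chat completion call."""
    url = f"http://{host}/v1/chat/completions"
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, payload

url, payload = build_chat_request("100.x.x.x:8080", "Hello!")
print(url)
# To actually send it (requires the server to be running and exposed):
# req = urllib.request.Request(url, data=payload,
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```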
noisy_boy | 8 days ago
car | 8 days ago
As for models, plenty of GGUF quants (down to 2-bit) are available on HF and ModelScope.