car | 9 days ago
Since I don't see it mentioned here: LlamaBarn is an awesome little (but mighty) macOS menu bar app that makes llama.cpp's great web UI and a tastefully curated set of downloadable models as easy as pie. It automatically determines which model and context sizes fit, based on available RAM.
https://github.com/ggml-org/LlamaBarn
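As a rough illustration of the RAM-based sizing idea (this is not LlamaBarn's actual heuristic; the bytes-per-parameter and headroom figures below are my own assumptions):

```python
# Back-of-envelope estimate of the largest quantized model that fits in RAM.
# NOT LlamaBarn's real logic -- just a sketch of the kind of calculation involved.

def max_params_b(ram_gb, bytes_per_param=0.55, headroom_gb=4.0):
    """Estimate max model size in billions of parameters.

    bytes_per_param ~0.55 roughly matches a ~4-bit quantization with
    overhead; headroom_gb is reserved for the OS and the KV cache.
    Both figures are assumptions, not LlamaBarn's constants.
    """
    usable = max(ram_gb - headroom_gb, 0.0)
    return usable / bytes_per_param

print(f"{max_params_b(16):.0f}B")  # prints 22B -- a 16 GB Mac fits roughly a 20B 4-bit model
```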
Downloaded models live in:
~/.llamabarn
Apart from the default of serving on localhost only, the bind address can be configured via `defaults` on the command line:

    # bind to all interfaces (0.0.0.0)
    defaults write app.llamabarn.LlamaBarn exposeToNetwork -bool YES
    # or bind to a specific IP (e.g., for Tailscale)
    defaults write app.llamabarn.LlamaBarn exposeToNetwork -string "100.x.x.x"
    # disable (default)
    defaults delete app.llamabarn.LlamaBarn exposeToNetwork
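Once exposed, the llama.cpp server behind LlamaBarn speaks an OpenAI-compatible HTTP API, so any standard client works against it. A minimal sketch (the host, port, and the `build_chat_request` helper are my own assumptions for illustration; check LlamaBarn for the actual address):

```python
# Hypothetical minimal client for the server's OpenAI-compatible
# /v1/chat/completions endpoint. Host and port are assumptions.
import json
import urllib.request

def build_chat_request(host, prompt, model="default"):
    """Construct (url, payload) for a chat completion call."""
    url = f"http://{host}/v1/chat/completions"
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, payload

url, payload = build_chat_request("100.x.x.x:8080", "Hello!")
print(url)
# To actually send it (requires the server to be running and exposed):
# req = urllib.request.Request(url, data=payload,
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```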
noisy_boy | 8 days ago
car | 8 days ago
As for models, plenty of GGUF quants (down to 2-bit) are available on HF and ModelScope.