tkp-415 | 11 days ago
Is my only option to invest in a system with more computing power? These local models look great, especially something like https://huggingface.co/AlicanKiraz0/Cybersecurity-BaronLLM_O... for assisting in penetration testing.
I've experimented with a variety of configurations on my local system, but in the end it turns into a makeshift heater.
0xbadcafebee|11 days ago
For your Mac, you can use Ollama or MLX (Apple-Silicon-specific; it requires a different engine and a different on-disk model format, but it's faster). RamaLama may help smooth over bugs or ease the process with MLX. Use either Docker Desktop or Colima for the VM + Docker.
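On the Ollama route, getting a first model running is a few commands. A minimal sketch (the model tag is just an example; pick one sized for your RAM):

```shell
# Install and start the local Ollama server (macOS; or download from ollama.com)
brew install ollama
ollama serve &

# Fetch a quantized model and chat with it.
# llama3.1:8b is an example tag, ~5 GB at its default quantization.
ollama pull llama3.1:8b
ollama run llama3.1:8b "Summarize what ARP spoofing is in two sentences."
```

`ollama run` with no prompt drops you into an interactive session instead.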
For today's coding & reasoning models, you need a minimum of 32GB of VRAM combined (graphics + system), and the more of it in the GPU, the better. Copying memory between CPU and GPU is too slow, so the model needs to "live" in GPU space. If it can't all fit in GPU space, your CPU has to work hard, and you get a space heater. A Mac M1 will do 5-10 tokens/s with 8GB (CPU at full blast), or 50 tokens/s with 32GB RAM (CPU idling). And now you know why there's a RAM shortage.
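The memory rule of thumb above is simple arithmetic: weights take roughly (parameter count × bits per weight ÷ 8) bytes, plus some headroom for the KV cache and buffers. A sketch (the 20% overhead factor is an assumption, and real footprints vary by context length):

```python
def model_memory_gb(params_billions: float, bits_per_weight: int,
                    overhead: float = 1.2) -> float:
    """Rough footprint: weight bytes plus ~20% for KV cache and buffers."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7B model at ~4-bit quantization fits easily in 8 GB...
print(round(model_memory_gb(7, 4), 1))    # ≈ 4.2 GB
# ...while a 70B model at the same quantization wants ~42 GB,
# which is why 32 GB is quoted as the practical minimum for big models.
print(round(model_memory_gb(70, 4), 1))   # ≈ 42.0 GB
```

This also shows why quantization matters: the same 7B model at full fp16 (16 bits/weight) needs about four times the memory of its 4-bit version.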
BoredomIsFun|10 days ago
It's hopelessly dated. There are much better, newer models around.
mft_|11 days ago
I picked up a second-hand 64GB M1 Max MacBook Pro a while back, for not too much money, for exactly this kind of experimentation. It's sufficiently fast at running any LLM that fits in memory, but the gap between those models and Claude is considerable. Still, this might be a path for you? It can also run all manner of diffusion models, but there the performance suffers (vs. an older discrete GPU) and you're sometimes waiting many minutes for an edit or an image.
ontouchstart|11 days ago
https://github.com/ggml-org/llama.cpp/discussions/15396
xrd|11 days ago
https://www.reddit.com/r/LocalLLM/
Every time I ask the same thing here, people point me there.
HanClinto|11 days ago
https://www.docker.com/blog/run-llms-locally/
As for how to find good models to run locally, I found this site recently and liked the data it provides:
https://localclaw.io/