
Well, Mac chips are badass for training / inference - super underrated. I've run the same training epochs on cloud Nvidia GPU servers and locally on an M-series chip, and look, not trying to burn any houses down, but Apple holds up really, really well.

The good news for you: you can chain a couple of them together and run the largest open-source models around. It's an extremely expensive route, but probably the easiest and smoothest one.
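
The reason chaining helps is unified memory: the weights have to fit somewhere. A rough back-of-envelope in Python - the 70B / 405B sizes and 4-bit quantization below are just illustrative numbers, and real deployments need extra headroom for KV cache and activations:

    # Rough lower-bound estimate of memory needed to hold quantized weights.
    # Real GGUF files add overhead (KV cache, activations, metadata).
    def weights_gb(params_billions: float, bits_per_weight: int) -> float:
        return params_billions * 1e9 * (bits_per_weight / 8) / 1e9

    print(weights_gb(70, 4))   # ~35 GB: fits in a 64 GB Mac's unified memory
    print(weights_gb(405, 4))  # ~200 GB: needs several machines pooled together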

If you're planning on running this on Apple, you can do some stuff with Metal directly - in PyTorch the backend is 'mps' (Metal Performance Shaders), not 'mcu'.
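
A minimal sketch of picking up the MPS device in PyTorch (this is the standard API, nothing exotic):

    import torch

    # torch.backends.mps is PyTorch's Metal backend on Apple Silicon.
    if torch.backends.mps.is_available():
        device = torch.device("mps")
    else:
        device = torch.device("cpu")

    # Tensors and models move to the GPU the same way as with CUDA.
    x = torch.randn(4, 4, device=device)
    print(x.device)  # prints mps:0 on an M-series Mac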

I think your llama.cpp route is good. I wouldn't go the Ollama route - great to start, but IMHO: get the models directly, learn the layers and how the attention heads work as best as you can, make an effort to understand what's going on. You don't have to, but I think the models appreciate the effort - respect goes far. A sketch of the direct route follows below.
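
If you want to stay in Python while using llama.cpp directly, the llama-cpp-python bindings are one way to do it - a sketch, assuming you've already downloaded a GGUF file yourself (the model path is a placeholder):

    # pip install llama-cpp-python  (macOS wheels ship with Metal support)
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/your-model.gguf",  # placeholder: any GGUF you grabbed
        n_gpu_layers=-1,  # offload every layer to the GPU (Metal on a Mac)
        n_ctx=4096,       # context window size
    )

    out = llm("Explain attention heads in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])

The nice part of going direct is that nothing is hidden: you pick the quantization, the context size, and how many layers sit on the GPU, instead of a wrapper deciding for you.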
