A future where we carry and manage just one device could be incredible. That said, even if iOS weren’t so locked down and were more capable of that today, I think I’d still find myself frustrated. I run local LLMs on my iPhone, and even a heavily quantized 3B-parameter model triggers the iPhone’s thermal management after just a few prompts with modest token counts: inference throttles to under 1 token per second, and the phone gets hot to the touch. Maybe the rumored half-iPhone, half-iPad device could be the eventual platform from which something like this emerges.
mark_l_watson|3 months ago
So, I feel like I routinely experience what we are talking about in this sub-thread. Given a few VPSes to ssh/mosh into for programming, plus a keyboard and mouse, this is a workable setup.
The one thing that always gets me to unpack my Mac Mini and set it up is that even with 16 GB of shared memory on an iPad Pro, I can only run local models in a chat-style app. On macOS, my LLM use is mostly embedded in experimental scripts and apps.
WorldPeas|3 months ago