I’m curious to hear more about phone-local assistants. I rather assumed only the latest hardware (iPhone 15+; not sure on the Android side) could do local inference. Is there a way to get something going on hardware a couple years old?
> Is there a way to get something going on hardware a couple years old?
Tensor accelerators are a very recent thing, and GPU/WebGPU inference is also recent.
RAM was also limited; 4GB was the barrier for a long time.
So the model has to run on the CPU and fit within 4GB, or even 2GB.
Oh, I forgot one important thing: mobile CPUs from a couple of years ago were also weak (the exception being iPhone/iPad).
But if you have a gaming phone (or an iPhone), which at the time was comparable to notebooks, it may run something like Llama-2 quantized to 1.8GB at about 2 tokens per second. Not very impressive, but it could work.
Unfortunately, I can't remember when the median mobile CPU became comparable in performance to business notebooks.
I think Apple entered the race for speed with the iPhone X and iPad 3. For Android, things were even worse; it looks like the median device reached notebook speed around the Qualcomm Snapdragon 6xx.
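To make the quantized-CPU path concrete, here is a minimal sketch using llama-cpp-python, one common way to run GGUF models CPU-only. The model filename, thread count, and context size are placeholders, not recommendations:

    # Rough sketch: CPU-only inference with a quantized GGUF model.
    # Assumes: pip install llama-cpp-python, and a quantized model file
    # (e.g. a ~1.8GB Llama-2 GGUF) already downloaded to the device.
    from llama_cpp import Llama

    llm = Llama(
        model_path="llama-2-quantized.gguf",  # hypothetical local file
        n_ctx=512,       # small context keeps the KV cache modest
        n_threads=4,     # match the phone's performance cores
        n_gpu_layers=0,  # CPU-only, as on older phones without a usable GPU
    )

    out = llm("Q: What is the capital of France? A:", max_tokens=32)
    print(out["choices"][0]["text"])

On older mid-range hardware, throughput in the low single digits of tokens per second, as described above, is a realistic expectation.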
FUTO voice typing runs locally on my Galaxy 20, so yes. There are also SPAs that claim to run models locally, which I have installed but haven't tried. And there are small models; one I know of is 380M parameters, rather than 15B or 800B...
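For scale, weight memory is roughly parameters × bits-per-weight / 8. A back-of-envelope sketch in Python (this ignores the KV cache and runtime overhead, which add to the total):

    # Back-of-envelope weight memory for various model sizes and quantizations.
    def weight_gb(params: float, bits_per_weight: float) -> float:
        return params * bits_per_weight / 8 / 1e9

    for params, name in [(380e6, "380M"), (7e9, "7B"), (15e9, "15B")]:
        for bits in (16, 8, 4):
            print(f"{name} @ {bits}-bit: {weight_gb(params, bits):.2f} GB")

A 380M-parameter model at 4-bit is about 0.19 GB of weights, which fits comfortably on a 2-4GB phone, while 15B does not fit at any common quantization.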