It seems like this is an orchestration layer that runs on Apple Silicon, given that the ChatGPT integration looks like an API call from it. It's not clear to me what is actually being computed on the "private cloud compute" side.
If I understand correctly, there are three things here:
- on-device models, which will power any tasks they're able to handle, including summarisation and conversation with Siri
- private cloud compute models (still controlled by Apple), for when the system wants to do something bigger that requires more compute
- external LLM APIs (only ChatGPT for now), for when the layers above decide an external model would handle the given prompt better; the system always asks the user for confirmation first
The second point makes sense. It gives Apple the optionality to cut off the external LLMs at a later date if they want to. I wonder what % of requests will be handled by the private cloud models vs. locally. I would imagine TTS and ASR are local for latency reasons. Natural language classifiers would certainly run on-device. I wonder whether summarization and rewriting will, though - those are more complex tasks and definitely benefit from larger models.
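The three-tier routing described above can be sketched as a simple decision function. This is purely illustrative: Apple hasn't published its actual routing criteria, so the tier names and the boolean signals (`fits_on_device`, `needs_external_knowledge`) are my own assumptions about what such a dispatcher might look like.

```python
from enum import Enum

class Tier(Enum):
    ON_DEVICE = "on-device model"
    PRIVATE_CLOUD = "private cloud compute"
    EXTERNAL = "external LLM API"

def route(fits_on_device: bool, needs_external_knowledge: bool) -> Tier:
    """Hypothetical dispatcher: prefer the most private tier that can
    plausibly handle the request."""
    if fits_on_device:
        return Tier.ON_DEVICE
    if not needs_external_knowledge:
        # Bigger task, but still answerable by Apple-controlled models.
        return Tier.PRIVATE_CLOUD
    # An external call would additionally require explicit user consent.
    return Tier.EXTERNAL

# e.g. a quick notification summary could stay on-device:
# route(fits_on_device=True, needs_external_knowledge=False) -> Tier.ON_DEVICE
```

The point of the structure is the one made above: because private cloud compute sits between the device and any third party, the last branch can be removed without breaking the first two.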
cube2222|1 year ago
localhost|1 year ago