
Ask HN: How does Apple Intelligence optimize KV cache?

2 points | Santisco | 1 year ago

Hi folks,

Is there any public information on how Apple Intelligence optimizes the KV cache? For instance, for Private Cloud Compute (PCC), Apple claims to use stateless computation, where the context is sent to the cloud with each request. Given that the context can be lengthy and the resulting KV cache quite large, what might Apple be doing to handle this?
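
For a sense of scale, here is a rough back-of-the-envelope sketch of how the KV cache grows with context length in a standard transformer decoder. The function name and every model dimension below are hypothetical, chosen only for illustration; they are not Apple's published numbers, and I am assuming a plain 16-bit cache with no compression or sharing tricks.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate KV cache size in bytes for one sequence.

    Assumes a standard decoder that stores one key and one value vector
    per token, per KV head, per layer (hence the leading factor of 2).
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical dimensions (NOT Apple's): 32 layers, 8 KV heads,
# head dimension 128, a 32,768-token context, 16-bit values.
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=32_768)
print(f"{size / 2**30:.1f} GiB")  # prints "4.0 GiB"
```

Under those assumptions a single long request needs about 4 GiB of cache, and with a stateless design that cache would presumably have to be rebuilt from the resent context on every request, which is what prompts the question.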
