item 44225003

SEGyges | 8 months ago

Every LLM provider caches its KV-cache; it's a publicly documented technique (stuff the KV in Redis after each request, basically), and a good engineering team could set it up in a month.
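A minimal sketch of the idea, with a plain dict standing in for Redis and a string standing in for the model's attention key/value tensors (the class and names here are illustrative, not any provider's actual implementation): store the computed KV state keyed by a hash of the exact token prefix, so a later request sharing that prefix can skip recomputing it.

```python
import hashlib

class PrefixKVCache:
    """Toy prefix-based KV cache: after serving a request, save the
    computed KV state keyed by the token prefix; a later request that
    shares the prefix reuses it instead of recomputing attention."""

    def __init__(self):
        self.store = {}  # stand-in for Redis

    def _key(self, tokens):
        # Hash the exact token sequence; any divergence is a cache miss.
        return hashlib.sha256(",".join(map(str, tokens)).encode()).hexdigest()

    def put(self, tokens, kv_state):
        self.store[self._key(tokens)] = kv_state

    def get(self, tokens):
        # Return (prefix_length, kv_state) for the longest cached prefix.
        for end in range(len(tokens), 0, -1):
            hit = self.store.get(self._key(tokens[:end]))
            if hit is not None:
                return end, hit
        return 0, None

cache = PrefixKVCache()
cache.put([1, 2, 3], "kv-for-123")       # fake KV blob for illustration
reused, kv = cache.get([1, 2, 3, 4, 5])  # shares the [1, 2, 3] prefix
```

Only the first `reused` tokens are skipped; the model still computes attention for the new suffix, which is why this saves prefill cost rather than returning a canned answer.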


chipsrafferty | 8 months ago

Are you saying that if I ask a prompt "foo" and then a month later another user asks "foo", it retrieves a cached value?

wkat4242 | 8 months ago

No, the key-value cache is the context itself, stored in a form the model can read directly; caching it saves recomputing attention over the same prefix, not regenerating the answer.