top | item 46323791

Amazing article. I was under the misapprehension that temp and other output parameters actually do affect caching. Turns out I was wrong and this explains why beautifully.

Great work. Learned a lot!

samwho|2 months ago

Yay, glad I could help! The sampling process is so interesting on its own that I really want to do a piece on it as well.

wesammikhail|2 months ago

Looking forward to it!

stingraycharles|2 months ago

I had a “somebody is wrong on the internet!!” discussion about exactly this a few weeks ago, and they claimed to be a professor in AI.

Where do people get the idea that temperature affects caching in any way? Temperature is about next-token prediction / output, not input.
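
To make that concrete, here is a rough sketch rather than anyone's actual implementation. The logits and token IDs are made up, and `prefix_cache_key` is a hypothetical helper standing in for however a real system identifies cached input. The point is only that temperature enters at the sampling step, after the input has already been processed:

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def prefix_cache_key(prompt_tokens):
    """Hypothetical cache key: depends only on the input tokens."""
    return tuple(prompt_tokens)


# Made-up values standing in for a prompt and the model's output logits.
prompt_tokens = [101, 2023, 2003]
logits = [2.0, 1.0, 0.1]

# Same logits, different temperatures: only the output distribution changes.
cold = softmax_with_temperature(logits, 0.5)  # sharper distribution
hot = softmax_with_temperature(logits, 2.0)   # flatter distribution

# The key has no temperature argument at all, so it cannot change.
key = prefix_cache_key(prompt_tokens)
```

Nothing in `prefix_cache_key` ever sees `temperature`, which is the whole argument in one line.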

wesammikhail|2 months ago

Because in my mind, as a person not working directly on this kind of stuff, I figured that caching was done similarly to any resource caching in a webserver environment.

It's a semantics issue where the word caching is overloaded depending on context. For people who are not familiar with the inner workings of LLMs, this can cause understandable confusion.
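
A toy illustration of the two meanings, with everything in it assumed for the sake of the example: the request shape, the function names, and the use of a whole-prompt hash (as I understand it, real prompt caches match on token prefixes, not a hash of the full prompt).

```python
import hashlib
import json


def response_cache_key(request):
    """Web-style cache: hash the whole request, so every parameter matters."""
    payload = json.dumps(request, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def prompt_cache_key(request):
    """Prompt-cache style (simplified): hash only the input text."""
    return hashlib.sha256(request["prompt"].encode("utf-8")).hexdigest()


# Two hypothetical requests that differ only in temperature.
a = {"prompt": "Explain KV caching.", "temperature": 0.0}
b = {"prompt": "Explain KV caching.", "temperature": 1.0}

# A webserver-like response cache treats these as different entries...
response_keys_match = response_cache_key(a) == response_cache_key(b)

# ...while a cache keyed on the input alone treats them as the same.
prompt_keys_match = prompt_cache_key(a) == prompt_cache_key(b)
```

With the webserver mental model, changing temperature looks like a cache miss; with the input-only model, it is still a hit.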

semi-extrinsic|2 months ago

Being wrong about details like this is exactly what I would expect from a professor. They are mainly grant writers and PhD herders, and often good presenters as well, but they mostly only have gut feelings about the technical details of stuff invented after they became a professor.