randomgermanguy | 3 months ago
That's what I tried to explain then as well, and I brought up things like path-finding algorithms for route-finding (A*/heuristic search) as a more old-school part of AI, which didn't really land, I think.
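For anyone who hasn't seen that old-school heuristic search before, here's a minimal A* sketch. The function names and the grid used in the usage note are just for illustration, not from any particular library:

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Minimal A* search.

    neighbors(node) -> iterable of (next_node, step_cost)
    h(node)         -> admissible heuristic estimate of cost to goal
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    # Priority queue ordered by f = g + h; each entry carries the
    # cost-so-far g and the path taken to reach the node.
    open_heap = [(h(start), 0, start, [start])]
    best_g = {start: 0}  # cheapest known cost to each node

    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g
        for nxt, cost in neighbors(node):
            ng = g + cost
            # Only expand if this is a strictly cheaper way to reach nxt.
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None, float("inf")
```

On a 3x3 grid with unit step costs and a Manhattan-distance heuristic, `a_star((0, 0), (2, 2), ...)` finds a shortest path of cost 4.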
> Not really stochastic as far as I know. The whole random seed and temperature thing is a bit of a grey area for my full understanding. Let alone the topk, top p, etc. I often just accept what's recommended from the model folks.
I mean, LLMs are often treated as stochastic in nature, but most ML models aren't, usually? Maybe you have some dropout, but that's usually disabled during inference AFAIK. I don't think a ResNet or YOLO is very stochastic, but maybe someone can correct me.
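To make the temperature/top-k point concrete, here's a minimal sketch of how LLM token sampling typically works. The function name and defaults are my own for illustration, not any particular library's API, and it skips edge cases like temperature = 0:

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=None, seed=None):
    """Sample a token index from raw logits.

    temperature scales the logits before softmax (lower -> peakier,
    more deterministic); top_k keeps only the k highest logits.
    A fixed seed makes the draw reproducible.
    """
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k is not None:
        # Mask everything below the k-th largest logit (ties at the
        # cutoff are kept, so this is "at least k" candidates).
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    # Softmax with the usual max-subtraction for numerical stability.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

With `top_k=1` this reduces to greedy decoding (always the argmax), which is why the stochasticity of an LLM is really a property of the sampling step, not of the forward pass itself.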
> AI for the most part has been out a couple years.
With this you just mean LLMs, right? Because I understand AI to be way more than just LLMs & ML.
sohojoe | 3 months ago
So the order in which floating-point additions happen is not fixed, because of how threads are scheduled and how reductions are structured (tree reduction vs warp shuffle vs block reduction).
Floating-point addition is not associative (because of rounding), so:
- (a + b) + c can differ slightly from a + (b + c).
- Different execution orders → slightly different results → tiny changes in logits → occasionally a different argmax token.
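You can see the non-associativity with ordinary double-precision floats, no GPU required:

```python
# Floating-point addition is not associative: rounding happens after
# each step, so different groupings can give different results.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # rounds after a+b, then again after adding c
right = a + (b + c)  # rounds after b+c, then again after adding a

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

On a GPU, a reduction over thousands of terms has many such groupings, and which one you get depends on scheduling, hence the occasional flipped argmax.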
rolisz | 3 months ago
randomgermanguy | 3 months ago
But at that point I feel like we're getting close to "everything that isn't a perfect Turing machine is somewhat stochastic" ;)
Edit: someone corrected me above, it does seem to matter more than I thought.