desdenova | 9 months ago

What makes models non-deterministic isn't the training algorithm, but the initial weights being random.

Training is reproducible only if, besides the pipeline and data, you also start from the same random weights.
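A minimal sketch of that point, using NumPy and a toy full-batch gradient descent loop (the model and data here are made up for illustration): if the only randomness is the seeded weight initialization, the whole training run is a deterministic function of the seed.

```python
import numpy as np

def train(seed):
    # Deterministic pipeline: the only randomness is the initial weights,
    # which are fixed by the seed.
    rng = np.random.default_rng(seed)
    w = rng.normal(size=3)          # random initial weights
    X = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
    y = np.array([1.0, 0.0])
    for _ in range(100):            # plain full-batch gradient descent
        grad = X.T @ (X @ w - y) / len(y)
        w -= 0.01 * grad
    return w

# Same seed (same initial weights) -> bit-identical final weights.
assert np.array_equal(train(0), train(0))
# A different seed generally lands on different final weights.
assert not np.array_equal(train(0), train(1))
```

(Real large-scale training adds other non-determinism sources this sketch ignores, e.g. non-associative floating-point reductions on parallel hardware and data-shuffling order.)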

timschmidt | 9 months ago

That would fall under "Feed the same data in and you'll get the same weights out." Lots of deterministic algorithms use a random seed.
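A tiny illustration of "deterministic algorithm with a random seed" (hypothetical function, just to show the property): once the seed is fixed, the "random" draws are a pure function of (seed, input).

```python
import random

def noisy_pipeline(seed):
    # Any algorithm that takes its "randomness" from a seeded PRNG
    # is a deterministic function of the seed and its inputs.
    prng = random.Random(seed)
    return [prng.gauss(0, 1) for _ in range(5)]

assert noisy_pipeline(42) == noisy_pipeline(42)   # same seed -> same outputs
assert noisy_pipeline(42) != noisy_pipeline(43)   # different seed -> different outputs
```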

alfiedotwtf | 9 months ago

So is there no “introduce randomness” step at some point afterwards? If not, I would guess these models would get stuck in a local maximum

addaon | 9 months ago

> If not, I would guess these models would get stuck in a local maximum

It sounds like you're referring to something like simulated annealing. Using that as an example, the fundamental requirement is to introduce arbitrary, uncorrelated steps -- there's no requirement that the steps be random, and the only potential advantage of using a random source is that it provides independence (lack of correlation) inherently; but in exchange, it makes testing and reproduction much harder. Basically every use of simulated annealing or similar I've run into uses pseudorandom numbers for this reason.
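A toy version of that: simulated annealing on a made-up bumpy 1-D function, with every "random" choice drawn from a seeded PRNG. The steps are arbitrary and uncorrelated, yet the whole run is reproducible, which is exactly the testing/debugging advantage described above.

```python
import math
import random

def anneal(seed, steps=2000):
    """Minimize a bumpy 1-D function with simulated annealing.

    All 'randomness' comes from one seeded PRNG, so the run
    (and therefore the result) is fully reproducible.
    """
    prng = random.Random(seed)
    f = lambda x: x * x + 3 * math.sin(5 * x)    # convex bowl plus local bumps
    x = prng.uniform(-10, 10)
    for i in range(steps):
        t = max(1e-3, 1.0 - i / steps)           # simple cooling schedule
        cand = x + prng.gauss(0, 0.5)            # arbitrary, uncorrelated step
        delta = f(cand) - f(x)
        # Always accept downhill moves; accept uphill with prob exp(-delta/t),
        # which shrinks as the temperature cools.
        if delta < 0 or prng.random() < math.exp(-delta / t):
            x = cand
    return x

# Same seed -> bit-identical run, so failures can be replayed exactly.
assert anneal(1) == anneal(1)
```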