warsheep | 4 years ago

The solution you suggest is irrelevant to the issue mentioned in the article. Even if you use np.random.RandomState, or any other "explicit RNG state", that state will still be copied in the fork() call.
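To make that concrete, here is a minimal sketch using only the standard library (POSIX only, since it calls `os.fork()`). The helper `draw_in_child` is a hypothetical name invented for this illustration, and `random.Random` stands in for `np.random.RandomState` on the assumption that any explicitly held RNG object behaves the same way under fork.

```python
import os
import random


def draw_in_child(rng):
    """Fork, draw one number from rng in the child, and return it to the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: it owns a byte-for-byte copy of the parent's rng state.
        os.close(read_fd)
        os.write(write_fd, repr(rng.random()).encode())
        os._exit(0)
    # Parent process: read the child's draw, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        value = float(reader.read())
    os.waitpid(pid, 0)
    return value


rng = random.Random(1234)  # explicit RNG state, no global generator involved

# The parent never advances rng, so each child starts from the same copied state.
first = draw_in_child(rng)
second = draw_in_child(rng)
print(first == second)
```

Both children report the same value, because passing the state around explicitly does not stop fork() from duplicating it.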

The post just stresses that one should be careful when combining random states and multiprocessing, so you should either reseed after forking or use a multiprocess/multithread-aware RNG API.
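A rough sketch of the "reseed after forking" option, again standard library and POSIX only. The names `draw_in_child`, `BASE_SEED` and `worker_index` are hypothetical, and deriving the child seed from a base seed plus a worker index is just one assumed scheme that keeps runs reproducible; calling `rng.seed()` with no argument would pull fresh OS entropy instead.

```python
import os
import random

BASE_SEED = 1234  # hypothetical experiment-wide seed


def draw_in_child(rng, worker_index):
    """Fork, reseed rng in the child from (BASE_SEED, worker_index), draw once."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: replace the inherited state before drawing anything.
        os.close(read_fd)
        rng.seed(f"{BASE_SEED}-{worker_index}")
        os.write(write_fd, repr(rng.random()).encode())
        os._exit(0)
    # Parent process: read the child's draw, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        value = float(reader.read())
    os.waitpid(pid, 0)
    return value


rng = random.Random(BASE_SEED)

worker0 = draw_in_child(rng, 0)
worker1 = draw_in_child(rng, 1)
worker0_again = draw_in_child(rng, 0)

print(worker0 != worker1)        # different workers, different streams
print(worker0 == worker0_again)  # same worker index, same stream
```

The design choice here is that each child's stream depends only on the base seed and its index, so the children no longer share a stream but the whole run stays repeatable.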

nemetroid|4 years ago

I believe the point is that the error will be more obvious if the state is passed around explicitly.

acdha|4 years ago

Possibly, but this is the kind of boilerplate people tend to ignore, especially once a program is non-trivial. It’s really easy to notice if you’re doing something like `seed_rng(); fork();` but once there’s distance between the two calls and more than one thing being passed around, I’d be surprised if you didn’t find the same pattern, perhaps a bit less often.

Fundamentally, there are two problems: fork() is a performance trick for doing setup only once, and seeding an RNG is a type of setup that, non-obviously, can’t be shared that way; and if most people learn from a tutorial or quick start, this is exactly the kind of important but non-core issue that gets omitted or ignored in that context.