leiroigh | 2 years ago

That thread dates from the Julia 0.6 days, run on my long-dead Broadwell, using dSFMT as a C library with (afaiu) hand-written intrinsics code for bulk generation of floats. In the meantime we switched the RNG to xoshiro, which is faster and generates 64-bit numbers natively (as opposed to dSFMT, which generated 53 bits natively...). So don't trust that these old timings represent current Julia; I updated the thread with current timings.
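For anyone wondering why the native output width matters for floats, here is a rough sketch of two textbook ways to turn a 64-bit integer into a double in [0, 1). This is illustrative only: the function names are made up, and it is not a claim about what Julia, xoshiro or dSFMT actually do internally.

```python
import struct

def float_from_top53(x: int) -> float:
    """Hypothetical helper: scale the top 53 bits of a 64-bit integer into [0, 1)."""
    return (x >> 11) * 2.0**-53

def float_from_mantissa52(x: int) -> float:
    """Hypothetical helper: fill the 52-bit mantissa of a double in [1, 2), then subtract 1."""
    bits = 0x3FF0000000000000 | (x >> 12)  # exponent bits for [1, 2) plus 52 random mantissa bits
    return struct.unpack("<d", struct.pack("<Q", bits))[0] - 1.0

if __name__ == "__main__":
    x = 0xFFFFFFFFFFFFFFFF
    print(float_from_top53(x), float_from_mantissa52(x))
```

The first variant lands on multiples of 2^-53, the second on multiples of 2^-52, so the second has a coarser grid near zero.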

I'd be very happy if this can of worms could be reopened, but I'm currently not active enough in Julia dev to champion it.

Also, somebody would need to figure out something very clever for AVX2 / NEON. (AVX512 has the required instructions.)

Also I can't imagine the mess with GPU -- if rand statistics differ widely between CPU and GPU that's a no-go, and I don't know what works well on which GPUs.
