sischoel | 1 year ago

The CUDA documentation tells me that there are faster but less precise trigonometric functions: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index....

Do you know if that hardware pipeline works only for these intrinsic variants?

dahart|1 year ago

Yep, these intrinsics are what I was referring to. And yes, the software versions won’t use the hardware trig unit; I would assume they’re written using an approximating spline and/or Newton’s method, probably mostly with adds and multiplies. Note that the loss of precision with these fast-math intrinsics isn’t very large: it’s usually 1 or 2 bits at most.
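
To make the "mostly adds and multiplies" idea concrete, here is a rough sketch of what a software sine could look like: a range-reduction step followed by a short odd polynomial evaluated in Horner form. This is an illustration only, in plain host-side C++ rather than CUDA device code. The name `soft_sin`, the single-constant reduction, and the Taylor coefficients are all assumptions made for the example, not how NVIDIA's library is actually written.

```cpp
#include <cmath>

// Hypothetical helper (not part of CUDA or any library): a software sine
// built from range reduction plus an odd polynomial, using only adds and
// multiplies after the rounding step.
float soft_sin(float x) {
    const float kPi    = 3.14159265358979323846f;
    const float kInvPi = 0.31830988618379067154f;

    // Range reduction: write x = k*pi + r with r roughly in [-pi/2, pi/2].
    // A single float constant for pi is an assumption that only holds up
    // for modest |x|; real libraries use more careful reductions.
    long  k = std::lround(x * kInvPi);
    float r = x - static_cast<float>(k) * kPi;

    // Degree-11 Taylor polynomial for sin(r), evaluated in Horner form
    // on r^2. A production version would likely use minimax coefficients.
    float r2 = r * r;
    float p = -1.0f / 39916800.0f;      // -1/11!
    p = p * r2 + 1.0f / 362880.0f;      // +1/9!
    p = p * r2 - 1.0f / 5040.0f;        // -1/7!
    p = p * r2 + 1.0f / 120.0f;         // +1/5!
    p = p * r2 - 1.0f / 6.0f;           // -1/3!
    p = p * r2 + 1.0f;
    float s = r * p;

    // sin(k*pi + r) = (-1)^k * sin(r)
    return (k & 1) ? -s : s;
}
```

For small arguments, comparing `soft_sin(x)` against `std::sin(x)` should agree to within a few float ulps; the error grows with |x| because the reduction here uses only one float constant for pi.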

mabster|1 year ago

I couldn't find much information on those. I assume that they don't include range reduction?