berkut | 1 year ago
A 256-entry float32 LUT for 8-bit sRGB -> linear conversion is definitely still faster than doing the division live (I re-benchmarked it on Zen 4 and Apple M3 last month). However, floating-point division on newer microarchitectures is not as slow as it was on processors from 10 or so years ago, so I can imagine a much larger LUT not being worth it.
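
For anyone who wants to see the shape of it, here is a rough sketch of the 256-entry table approach. It assumes the standard piecewise sRGB decode formula for the "live" path, and the names (`srgb_to_linear`, `SRGB8_TO_LINEAR`, `decode_pixels`) are made up for illustration, not taken from any benchmark code. It shows the technique only; Python is not the place to measure the speed difference.

```python
from array import array


def srgb_to_linear(c: float) -> float:
    """Decode one sRGB-encoded value in [0, 1] to linear light.

    Assumption: the standard piecewise sRGB transfer function.
    """
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


# One float32 entry per possible 8-bit code value (256 * 4 bytes = 1 KiB),
# built once up front so the per-pixel work is a single indexed load.
SRGB8_TO_LINEAR = array("f", (srgb_to_linear(i / 255.0) for i in range(256)))


def decode_pixels(data: bytes) -> array:
    """Convert 8-bit sRGB samples to linear float32 values via table lookup."""
    lut = SRGB8_TO_LINEAR
    return array("f", (lut[b] for b in data))
```

Usage would be something like `decode_pixels(bytes([0, 128, 255]))`. At 1 KiB the table should sit comfortably in L1 on typical current CPUs, whereas a much larger table (say, one entry per 16-bit value) would not, which is part of why the trade-off shifts as the table grows.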
fp64|1 year ago