top | item 44145994

(no title)

Why do you think it is a huge tolerance? (Just curious, since it is not clear to me whether it would cost too much numerical accuracy relative to the speedup.)

creato|9 months ago

The point is that this amount of error is huge for fp32 but may be expected for fp16. But then why compare to fp32 performance baselines? An algorithm that gives you the accuracy of fp16 should be compared to an fp16 baseline, and against that baseline this may not be (and probably is not) a speedup at all; it is likely much slower.
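For a rough sense of scale, here is a minimal sketch (not taken from the paper) of how far apart the two formats are. It rounds a set of made-up sample values to fp16 and fp32 using Python's standard `struct` module and compares the worst relative rounding error to each format's unit roundoff (2^-11 for fp16, 2^-24 for fp32). The helper names and the sample values are assumptions for illustration only.

```python
import struct

def round_trip(fmt: str, x: float) -> float:
    """Round a Python float (fp64) to the nearest value representable in `fmt`.

    'e' is IEEE 754 binary16 (fp16), 'f' is binary32 (fp32).
    """
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

def max_relative_error(fmt: str, samples) -> float:
    """Worst-case relative error from storing each sample in `fmt`."""
    return max(abs(round_trip(fmt, x) - x) / abs(x) for x in samples)

# Hypothetical sample values in [1, 2), well inside fp16's normal range.
samples = [1.0 + k / 997.0 for k in range(1, 997)]

err_fp16 = max_relative_error("e", samples)
err_fp32 = max_relative_error("f", samples)

# Unit roundoff: half the gap between 1.0 and the next representable value.
UNIT_ROUNDOFF_FP16 = 2.0 ** -11  # about 4.88e-4
UNIT_ROUNDOFF_FP32 = 2.0 ** -24  # about 5.96e-8

print(f"fp16 max relative error: {err_fp16:.3e} (bound {UNIT_ROUNDOFF_FP16:.3e})")
print(f"fp32 max relative error: {err_fp32:.3e} (bound {UNIT_ROUNDOFF_FP32:.3e})")
```

A single rounding in fp16 is already thousands of times coarser than one in fp32, so a tolerance that is unremarkable for an fp16 computation can be enormous by fp32 standards.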

beyonddream|9 months ago

My original question was meant to understand why it is considered a huge tolerance and what would be considered a low tolerance. I suspect the paper's intention is not to compare apples and oranges. They are trying to optimize the fp32 baseline by sometimes resorting to fp16, as long as the resulting solution's numerical accuracy is within the tolerance level. They are going for the "low-hanging fruit" type of optimization.