Is there any tooling that lets one profile the individual parts of the compute graph? In CUDA land, I can imagine how to build such a thing. No clue how it would work on Mobile.
I wonder if future versions of PyTorch will automatically apply mixed precision when not specified; the article makes it seem like a no-brainer to use it by default.
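For context, mixed precision is currently opt-in via `torch.autocast`. A minimal sketch of the manual workflow that "by default" would have to automate (the model, shapes, and learning rate are placeholders, and the dtype choice is an assumption about the hardware):

```python
import torch

# The autocast dtype is a per-hardware choice: float16 on most GPUs,
# bfloat16 on CPUs and recent GPUs (an assumption for this sketch).
device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = torch.nn.Linear(16, 4).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(8, 16, device=device)
y = torch.randn(8, 4, device=device)

# Nothing is applied automatically: the context manager must be added
# explicitly around the forward pass / loss computation.
with torch.autocast(device_type=device, dtype=amp_dtype):
    loss = torch.nn.functional.mse_loss(model(x), y)

loss.backward()  # parameters and their gradients stay in float32
opt.step()
```

On CUDA with float16 you would typically also use `torch.cuda.amp.GradScaler` around the backward pass to avoid gradient underflow; bfloat16 usually doesn't need it.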
Some parts of a module may not work well in lower precision and need to stay in higher precision. If you ever create new tensors within the forward pass, you need to adjust your code so they automatically get the right datatype. You still need to figure out which low-precision dtype you want to use in your model (and perhaps use different ones in different parts of the model). Etc.
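The "new tensors within the forward pass" point can be shown concretely; this toy module (the name and shapes are made up for the example) derives dtype and device from its input instead of hard-coding them:

```python
import torch

class AddNoise(torch.nn.Module):
    """Toy module that creates a fresh tensor inside forward()."""
    def forward(self, x):
        # Hard-coding dtype=torch.float32 here would silently promote
        # the output back to float32 whenever x arrives as
        # float16/bfloat16. Deriving dtype and device from the input
        # keeps the module precision-agnostic.
        noise = 0.01 * torch.randn(x.shape, dtype=x.dtype, device=x.device)
        return x + noise

x = torch.randn(4, 4, dtype=torch.bfloat16)
out = AddNoise()(x)
# out keeps the input's precision instead of being upcast
```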
TF is pretty much dead. The examples often don't work, the docs are not up to date, and I don't think any recent papers/projects use TF, so you'll also find a better community and better resources around PyTorch.
Debugging is a lot easier in PyTorch. Although you can debug the compiled graph in TensorFlow, in my experience the local state might not be the same in debug mode as in compiled mode.
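Part of why this is easier in eager mode: `forward()` is plain Python running on concrete tensors, so ordinary tooling works mid-forward (the module below is an illustrative stand-in):

```python
import torch

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 2)

    def forward(self, x):
        h = self.fc(x)
        # Any Python tooling works here on live values: print, logging,
        # breakpoint(). Under a compiled graph you would only have
        # symbolic tensors at trace time, not concrete numbers.
        print("h stats:", tuple(h.shape), float(h.mean()))
        # import pdb; pdb.set_trace()  # would drop into a live debugger
        return torch.relu(h)

out = Net()(torch.randn(4, 8))
```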
Also, I've encountered strange performance regressions with the newest Docker releases of TensorFlow, with 10x slow-downs compared to previous minor releases. And the Docker version was always slower than the local version. Something something Nvidia & CUDA, I guess. I had no performance differences with PyTorch when using Docker.
It should be said that TensorFlow was generally 10 to 20% faster for similar models. But that could be down to my ineptitude.
One reason is that overall there are more PyTorch-based ML projects out there, which translates to a larger exploration space and a wider support base. Around the beginning of 2021, PyTorch overtook TensorFlow as the ML framework of choice; see https://trends.google.com/trends/explore?date=today%205-y&q=...
PyTorch has a very good record of backwards compatibility compared to TensorFlow; your code is much less likely to be broken or deprecated if you use PyTorch.
At 100% zoom in Firefox and Edge, the tops and bottoms of lowercase letters have some very strange thinning/bolding going on. The top of the `e` in "Syed Ahmed" is thinner (maybe 1px high) than the lower curve of the same `e` (maybe 3px high). It looks like the top and bottom of the characters somehow have different font-weights.
Zooming in to 125%, the effect goes away and the font-weight at the top and bottom appears equally thick.
brutus1213 | 2 years ago
hatthew | 2 years ago
https://pytorch.org/tutorials/intermediate/tensorboard_profi...
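The tutorial linked above uses `torch.profiler`, which does break time down per operator rather than only end-to-end. A minimal CPU-only sketch (the model and input are placeholders):

```python
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)
x = torch.randn(32, 128)

# record_shapes lets the report group times by input shape as well;
# add ProfilerActivity.CUDA to also capture GPU kernels.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    model(x)

# Per-operator breakdown of where the time went
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```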
jerpint | 2 years ago
hatthew | 2 years ago
aborsy | 2 years ago
NeutralForest | 2 years ago
dog436zkj3p7 | 2 years ago
pcwelder | 2 years ago
I switched to PyTorch after I encountered this bug in a very normal use case back in v1.13: https://colab.research.google.com/drive/1D-kgD7NiRXTNTNwVr18...
I've never encountered such a bug in PyTorch in the last 4-5 years.
david-gpu | 2 years ago
As a researcher, I also found PyTorch much easier to tinker with, which is perhaps a factor that explains why it rapidly gained popularity in academia.
fransje26 | 2 years ago
blitzar | 2 years ago
armcat | 2 years ago
logicchains | 2 years ago
skadamat | 2 years ago
brutus1213 | 2 years ago
dist-epoch | 2 years ago
imtringued | 2 years ago
olives | 2 years ago
Shrezzing | 2 years ago
keithalewis | 2 years ago
hatthew | 2 years ago
blitzar | 2 years ago
atomlib | 2 years ago