brianchu | 8 years ago
2. Compiling TensorFlow from source for CPU is a bit of a hassle, but I have seen nice performance gains (10-20%) for LSTM tasks. I bet you would get even higher gains for CNNs since they're more parallelizable. (Note: I've never gotten the latest TF to work with Intel MKL.)
3. I haven't fully tested this myself, but with the P100s you also have full support for half-precision (FP16) floats, which supposedly offer a huge speedup.
4. I also would have liked to see benchmarks of other frameworks such as PyTorch. I haven't used them myself, but everything I've heard indicates that TensorFlow is often slower.
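
To make point 3 a bit more concrete, here is a rough sketch of the trade-off behind half precision. It is not a benchmark and says nothing about actual P100 throughput. It only shows that an IEEE 754 half-precision value takes 2 bytes instead of 4, and that values get rounded when stored that way. It uses only Python's standard `struct` module. The helper names `bytes_per_value` and `round_trip_half` are made up for illustration.

```python
import struct


def bytes_per_value(fmt: str) -> int:
    """Size in bytes of one value in the given struct format.

    "e" is IEEE 754 half precision, "f" is single precision.
    (Hypothetical helper, named here only for this example.)
    """
    return struct.calcsize("<" + fmt)


def round_trip_half(x: float) -> float:
    """Store x as a half-precision float and read it back.

    The result shows the rounding that half precision introduces.
    (Hypothetical helper, named here only for this example.)
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


if __name__ == "__main__":
    # Half precision needs half the bytes of single precision.
    print("bytes per half:  ", bytes_per_value("e"))
    print("bytes per single:", bytes_per_value("f"))

    # 0.1 is not exactly representable, and half precision
    # keeps far fewer digits than a Python float does.
    print("0.1 stored as half:", round_trip_half(0.1))
```

Halving the bytes per value means half the memory traffic for the same tensor, which is one reason FP16 can be faster on hardware that supports it natively. The rounding is the cost, so whether a given model tolerates it is something you would have to test.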