top | item 14575465

Ask HN: Why TensorFlow instead of Theano for deep learning?

138 points | DrNuke | 8 years ago | reply

I am an average solo user / applied researcher using Windows locally with a GTX 1070 8GB, looking for speed and documentation first, and Theano is way ahead in those departments. That said, we are also told TensorFlow is the next big thing because of scalability (?). TensorFlow also works under Windows with Anaconda and Python 3.5 through Keras, so I do have it available and can try the benchmarks. Where do we stand, really? Thanks.

49 comments

[+] asah|8 years ago|reply
TensorFlow automatically discovers and uses GPUs and multiple cores, and I'm assuming Google is working on better support for multiple GPUs, which currently requires hacks/tweaking to get speedups (it's easy to 'use' them).

TensorFlow is a platform "winner", and approximately 100% of new innovations will quickly be ported to it. TBD which of the other frameworks will "keep up" as innovations continue to come out.

other recommendations:

- by default, TensorFlow allocates 100% of GPU RAM for each process. You'll want to control this: https://stackoverflow.com/questions/34199233/how-to-prevent-...

- Keras. Yes, this. It dramatically reduces code volume, often 2-10x, without loss of control AFAICT.

- cloud hardware. Pretty quickly you'll want to scale, run multiple tests at once, and e.g. quickly back up & copy data, replicate system images, etc. I use Google Cloud and it's much easier (and cheaper) than AWS. Haven't tried Azure but have heard good things. At least once, Google's internet bandwidth has saved me hours waiting for data transfers.
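For the GPU RAM point above, the usual fix is the session config (a minimal TF 1.x-style sketch; the `0.4` fraction is just an arbitrary example value):

```python
import tensorflow as tf  # TF 1.x-era API

# By default TF grabs nearly all GPU memory per process. Either let the
# allocation grow on demand...
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

# ...or cap it at a fixed fraction of the card's memory:
# config.gpu_options.per_process_gpu_memory_fraction = 0.4

sess = tf.Session(config=config)
```

Either option lets several processes share one GPU instead of the first one monopolizing it.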

[+] laingc|8 years ago|reply
Your comments are mostly valid, but I disagree about Keras. Although it's marvellous for patching something together quickly, if you want to branch out at all then it quickly becomes an absolute mess.

Far better, in my view, is to work with the newer Estimators API. It is almost as fool-proof as Keras, but instead of trying to be a framework as such, the Estimators/learn API essentially just wraps up some of the boilerplate that you need with raw tensorflow, and internally looks fairly similar to the code you might write yourself. Consequently, it preserves the composability of TF far better than Keras.
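For context, the Estimators workflow looks roughly like this (a hedged TF 1.x-era sketch; the `model_fn` body, layer sizes, and `model_dir` path are made up for illustration):

```python
import tensorflow as tf  # TF 1.x-era Estimators API

def model_fn(features, labels, mode):
    # Plain TensorFlow inside; the Estimator only wraps the boilerplate
    # (sessions, checkpoints, summaries) around code like this.
    logits = tf.layers.dense(features["x"], 10)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer().minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir="/tmp/model")
# then: estimator.train(input_fn=...), estimator.evaluate(input_fn=...)
```

Since the `model_fn` is raw TF, you can swap in any graph code you would have written by hand, which is the composability point above.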

[+] deepnotderp|8 years ago|reply
Keras definitely means you lose control, but it's a tradeoff that's worth making in many cases.
[+] rryan|8 years ago|reply
Anyone who loves the Lisp concept of "code is data" will love TensorFlow.

Instead of coding imperatively, you write code to build a computation graph. The graph is a data structure that fully describes the computation you want to perform (e.g. training or inference of a machine learning model).

* That graph can be executed immediately, or stored for later.

* Since it's a serializable data structure, you can version it quite easily.

* You can deploy it to production without production having to depend on ANY of the code that built the graph, only the runtime necessary to execute it.

* You can run a compiler on it (such as XLA or TensorFlow's built in graph rewriter) to produce a more efficient version of the graph.

* In some circumstances, you can even compile the runtime away, producing a single .h/.o that you can link directly into e.g. a mobile app.

It's a beautiful and highly useful abstraction that allows TensorFlow to have both a great development and production story in one framework. Most frameworks only have a good story for either development or production.
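The graph-as-data idea can be sketched in a few lines of plain Python (a toy illustration only; TensorFlow's real graph representation and runtime are far richer):

```python
# Toy illustration of "code is data": calling these functions does not
# compute anything, it records operations as a data structure.
class Node:
    def __init__(self, op, inputs):
        self.op, self.inputs = op, inputs

def const(v): return Node("const", [v])
def add(a, b): return Node("add", [a, b])
def mul(a, b): return Node("mul", [a, b])

def run(node):
    # A tiny "runtime": walks the data structure and evaluates it.
    # A real framework could instead serialize, rewrite, or compile it.
    if node.op == "const":
        return node.inputs[0]
    x, y = (run(n) for n in node.inputs)
    return x + y if node.op == "add" else x * y

graph = mul(add(const(2), const(3)), const(4))  # describes (2 + 3) * 4
print(run(graph))  # -> 20
```

In TensorFlow the analogous object (the GraphDef) is a protocol buffer, which is exactly what makes the serialization, versioning, and compilation points above natural.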

If you are a machine learning researcher who doesn't need or care about deploying your work (i.e. mostly publishing papers), you may not want the overhead of having to deal with building a graph, and may prefer something that computes imperatively like PyTorch. If you are building products / services that use ML and developing/training your own models (as opposed to taking pre-trained models and using them), there is really no credible competitor to TensorFlow.

Disclaimer: I work at Google. I spend all day writing TensorFlow models. I'm not on the TensorFlow team nor do I speak for them or Google.

[+] fnl|8 years ago|reply
> If you are building products / services that use ML and developing/training your own models (as opposed to taking pre-trained models and using them), there is really no credible competitor to TensorFlow.

MXNet has amalgamation: http://mxnet.io/how_to/smart_device.html#amalgamation-making...

CNTK provides a managed ("evaluation") library solution to deploy your models and embed them in C, C++, C#, Python, and even an experimental Java version. https://docs.microsoft.com/en-us/cognitive-toolkit/CNTK-Eval...

How's that not competitive with TF? MXNet's approach is a bit unwieldy, yes, but seems easily streamlined. And CNTK's deployment method looks perfectly fine. Note I haven't checked other DL libs, but it seems unreasonable to assume that Microsoft and Amazon have no "competitive" solution for deployment.

[+] sirfz|8 years ago|reply
We moved over to TensorFlow from Theano around a year ago. I'm a software engineer on the team, and here's what I think are the advantages from my POV:

1) Transition was fairly straightforward, both APIs' interfaces are more-or-less similar and share some design characteristics.

2) Having said that, TF's API is easier to use and without a doubt a lot easier to read.

3) Consistency: Deploying Theano in different environments surprised me on several occasions with different output compared to the training environment. TF is more consistent on this front (never had such issues).

4) Running multiprocessing with Theano + GPU is a disaster (due to forking), so I ended up having to create process pools before initializing Theano. No such issues with TF.

5) TF provides many helpful operators (such as queues and batching ops) as well as monitoring tools (Tensorboard) and debugging tools.

6) Its development is extremely rapid, new releases every couple of months with a lot of improvements and new features every time.

In short, TF is what Theano should have been. A lot of new papers are implemented in TF as well, so it helps to understand it.

[+] digitalzombie|8 years ago|reply
> 6) Its development is extremely rapid, new releases every couple of months with a lot of improvements and new features every time.

How stable is the API then?

I think Google is a bit notorious for this (e.g. Angular vs Angular 2).

[+] paulsutter|8 years ago|reply
Fortunately, it's not an irrevocable decision like choosing a JavaScript framework. With deep learning you spend a lot of time considering a small amount of code.

We use several frameworks because sample code from different papers uses different frameworks. It's not that big of a deal.

[+] julsimon|8 years ago|reply
When it comes to scalability, Apache MXNet (http://mxnet.io/) is actually the best choice. Multi-GPU support and distributed training on multiple hosts are extremely easy to set up. It's also supported by Keras (still in beta, though).
[+] visarga|8 years ago|reply
TensorFlow is better for deployment. PyTorch is better for research. Theano/Keras is simpler to use and a little faster than TensorFlow.
[+] DLEnthusiast|8 years ago|reply
"PyTorch is better for research" is a weird, unsubstantiated statement. The fact is that few serious researchers use PyTorch (and even those complain about it). It's mostly grad students in a handful of labs. The only researchers I know who use PyTorch are from Facebook, and that's because they were implicitly forced to use it (PyTorch is developed by Facebook).

According to https://medium.com/@karpathy/icml-accepted-papers-institutio... , 3 of the top research labs in the world are DeepMind, Google Brain (and the rest of Google), and Microsoft Research. Let's see:

* DeepMind: TensorFlow

* Google Brain: TensorFlow

* Microsoft Research: CNTK

Ok, so what about academia? The top deep learning groups in academia are:

* Montreal: Theano

* Toronto: TensorFlow

* IDSIA: TensorFlow

So, what about the greater academic research community? Maybe we could get some data about who uses what by looking at the frameworks cited by researchers in their papers. Andrej did that: it's mainly TensorFlow and Caffe. https://medium.com/@karpathy/a-peek-at-trends-in-machine-lea...

[+] kasbah|8 years ago|reply
I am kind of new to all of this, but as far as I understand you can use Keras with TensorFlow as well.
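Right: Keras can run with TensorFlow as its backend. A minimal sketch (the layer sizes and the 784-dimensional input are arbitrary example values):

```python
from keras.models import Sequential
from keras.layers import Dense

# With the TensorFlow backend selected (e.g. in ~/.keras/keras.json),
# this Keras code builds and trains a TensorFlow graph under the hood.
model = Sequential([
    Dense(128, activation="relu", input_shape=(784,)),
    Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=5) on e.g. flattened MNIST images
```

So you don't have to pick one or the other; Keras is a front-end, TF is the engine underneath.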
[+] curiousgal|8 years ago|reply
>Pytorch is better for research.

and yet a lot of researchers are using Caffe.

[+] ssivark|8 years ago|reply
An observation when taking a step back: the discussion about deep learning frameworks seems almost as complicated as the JavaScript framework discussions a couple of years ago. Google and Facebook pushing their own frameworks (among other participants) also adds to the deja vu!

Why is the choice of framework such a big deal? Is it unreasonable to expect someone well-versed in one framework to pick up another reasonably fast if/when collaborating with someone proficient in the latter?

[+] laingc|8 years ago|reply
To answer your first question, I actually don't think it's a big deal.

It may become important if you end up having a ton of models running in production that need to be maintained and further developed, but in general for new applications I would say that substantially less than 5% of your time would (should!) be spent actually writing any code.

[+] anxman|8 years ago|reply
One other benefit of TensorFlow is that transitioning to cloud-based processing on Google Cloud / Tensor Processing Units is seamless. It will turbocharge your training compared to typical GPU performance.

Disclosure: Work for Google Cloud

[+] asah|8 years ago|reply
Any ETA on those TPUs? Stop teasing! ;-)
[+] nafizh|8 years ago|reply
Have you considered using PyTorch? Actually, many in the DL community think it is the next big thing, as it is more intuitive and dynamic than TensorFlow.
[+] k__|8 years ago|reply
What's the best thing to build when starting TF? Like, the todo list of TF?
[+] mmv|8 years ago|reply
TensorFlow has a nice introductory tutorial using the MNIST dataset for recognition of handwritten digits.

Creating a small neural network and training it on the MNIST dataset is the 'todo list' starter project for this kind of framework.

[+] bajaj|8 years ago|reply
You can train on the MNIST handwritten digit dataset.
[+] ma2rten|8 years ago|reply
Here are some reasons why TensorFlow might be better:

* more widely used, more example code

* developed by a bigger team, likely to improve faster

* easier to deploy

* training with Cloud ML

* better support for distributed training

* no compile time (Theano's compile step can be long, especially for RNNs)

[+] yuanchuan|8 years ago|reply
TensorFlow might not be the fastest in terms of raw computation speed, but it can be used from research through production with TensorFlow Serving.

As such, you won't need to reimplement or convert your model to another format for deployment.

[+] torbjorn|8 years ago|reply
TensorFlow has TensorBoard, a great application that allows you to explore your models in depth. It makes neural networks less of a black box.
[+] johnsmith21006|8 years ago|reply
Over 60k stars on github for TF. It won.
[+] ci5er|8 years ago|reply
If people weren't lemmings, that would be a valuable indicator. But, it turns out that...
[+] manis404|8 years ago|reply
Personally, I use a combination of Tensorflow and Appex frameworks. I find Theano simply lacking in features.
[+] he0001|8 years ago|reply
Use an AI program to answer that question!
[+] he0001|8 years ago|reply
Wouldn't it be reasonable to be able to use an AI program to tell which AI program is the best? That would be some sort of Turing test in itself. And if not, so much for AI?