Pyro: PyTorch-Based Deep Universal Probabilistic Programming

[+] ineedasername|8 years ago|reply

"Deep" probabilistic? A quick search & it seems the term "Deep Probabilistic" was coined by the package author. Which, hey, nice work and all, but "deep" in this context looks like pure marketing fluff.

Maybe it's not, I'm no longer cutting edge on this stuff-- my grad school days were a decade ago and the day job ::sigh:: doesn't require the more interesting stuff.

But I'm gonna get a bit "get off my lawn" on this one and say that, in my day (woohoo!), neural nets could be deep; they had hidden depths (& layers). Belief networks could be deep, and they were adding depth to learning too. Much of the "deep" stuff today seems to use that word the same way Tide & OxiClean have "deep" cleaning technology in their laundry detergent.

All of which is to say, this is a question from someone in the early stages of cruftiness, meant in good humor, to ask "What makes them there probablistics 'deep' ?" :)

[+] mendeza|8 years ago|reply

I think the goal is to bring uncertainty modeling and powerful Bayesian inference techniques into deep learning. I understand the terminology sounds gimmicky, but the techniques are very useful and are used in current research. One example of combining probabilistic inference with deep learning is modeling the uncertainty of a CNN by approximating the distribution over its weights using dropout layers. If interested, google the work done by Alex Kendall and Yarin Gal.

[+] fritzo|8 years ago|reply

Deep Probabilistic Programming

Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei

https://arxiv.org/abs/1701.03757

[+] juxtaposicion|8 years ago|reply

How does this compare to Edward, PyMC, Stan, et al? Is the primary distinction due to PyTorch’s imperative, dynamic programming?

[+] fritzo|8 years ago|reply

Edward: like Edward, Pyro is a deep probabilistic programming language that focuses on variational inference but supports composable inference algorithms. Pyro aims to be more dynamic (by using PyTorch) and universal (allowing recursion).
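To make "universal (allowing recursion)" a bit more concrete, here is a minimal sketch in plain Python, not Pyro code. The function name `geometric` and the parameter values are made up for illustration. The point is that the number of random choices is itself random, which is awkward to express as a fixed, static graph:

```python
import random

def geometric(p, rng):
    """Count failures before the first success, defined recursively."""
    if rng.random() < p:            # success: stop recursing
        return 0
    return 1 + geometric(p, rng)    # failure: recurse; the depth is itself random

rng = random.Random(0)              # fixed seed so the run is repeatable
samples = [geometric(0.5, rng) for _ in range(10_000)]
mean = sum(samples) / len(samples)  # theory says (1 - p) / p = 1 for p = 0.5
print(mean)
```

In a PPL the `rng.random() < p` line would be a named sample statement instead, so that an inference engine can reason about it.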

PyMC, Stan: Pyro embraces deep neural nets and currently focuses on variational inference. Pyro doesn't do MCMC yet. Whereas Stan models are written in the Stan language, Pyro models are just python programs with pyro.sample() statements.

One unique feature of Pyro is the probabilistic effect library that we use to build inference algorithms: http://docs.pyro.ai/advanced.html

[+] yodon|8 years ago|reply

Can someone eli5 probabilistic programming?

[+] zitterbewegung|8 years ago|reply

You got this new kind of variable. It’s special because instead of holding an object or a number it holds the dice. When you evaluate this variable it gives you a dice roll.

To use it to solve real systems you create a model where some of the variables are dice and you shake the system until you get an answer that satisfies you.
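A rough sketch of that idea in plain Python (no PPL library; the names `die` and `shake` are invented here). Two variables are dice, and "shaking" means rerunning the model and keeping only the runs that match what we observed, in this case that the two dice sum to at least 10:

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def die():
    """A 'variable that holds the dice': each call gives a fresh roll."""
    return rng.randint(1, 6)

def shake(n_tries=100_000):
    """Rerun the model many times, keeping only runs that match the observation."""
    kept = []
    for _ in range(n_tries):
        a, b = die(), die()
        if a + b >= 10:      # the observation we condition on
            kept.append(a)   # remember the first die from accepted runs
    return kept

kept = shake()
# Estimated probability that the first die was a 6, given the sum was >= 10.
p_first_is_six = sum(1 for a in kept if a == 6) / len(kept)
print(p_first_is_six)
```

Counting outcomes by hand gives 3 of the 6 qualifying rolls with a 6 on the first die, so the estimate should hover around 0.5. Real systems use much smarter methods than brute-force retrying, but the question being answered is the same.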

[+] obastani|8 years ago|reply

Consider the following (traditional) program:

  x = int(input())  # read an integer from the user
  x *= 2
  assert x >= 10

Now we can ask the question: "Assuming the assertion passed, what can we say about x?" In this simple example, we know that the value read in was at least 5, but in general the possible values of x may be much more complicated.

This is the kind of question probabilistic programming is designed to answer, except instead of being an arbitrary, unknown value, x is specified as some distribution (say Gaussian). Then, the question becomes, "What is the posterior distribution of x, assuming all of the assertions pass?" In other words, figure out how likely different choices of x were, taking into account both the prior (i.e., that x is Gaussian) and the new information from the assertions.
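As a hedged illustration, the crudest way to answer that question is rejection sampling: run the program many times with x drawn from the prior and keep only the runs where the assertion holds. The prior Normal(mean 4, sd 2) and the helper name `run_once` are arbitrary choices for this sketch, and real PPLs use far more efficient inference:

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

def run_once():
    """One execution of the program with x drawn from the prior."""
    x = rng.gauss(4.0, 2.0)      # assumed prior: x ~ Normal(mean=4, sd=2)
    passed = (x * 2 >= 10)       # the assertion from the program above
    return x, passed

n_runs = 200_000
accepted = []
for _ in range(n_runs):
    x, passed = run_once()
    if passed:                   # keep only runs where the assertion holds
        accepted.append(x)

prior_mean = 4.0
posterior_mean = sum(accepted) / len(accepted)
print(prior_mean, posterior_mean, len(accepted) / n_runs)
```

The accepted values of x are samples from the posterior. Their mean sits well above the prior mean of 4, because the assertion rules out everything below 5.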

For a simple example of why this is useful, suppose we have a program that generates random images (this is our prior). We also have some real photographs. We can assert that the random image generator should have a pretty good chance of generating the real photographs. Then, the probabilistic program "execution" will try to compute a new generator that creates more realistic images.

[+] nightski|8 years ago|reply

I honestly don't think it would be possible for a 5 year old to learn or even grasp the concept of probabilistic programming. So maybe wait until high school (or, if very gifted, middle school), when you have a more solid mathematical foundation. It's the process of inferring the parameters of a Bayesian probabilistic model from data and then making predictions with that model.

A probabilistic programming language lets you express the model in code, and then it performs inference automatically using various methods. Note that doing it manually may produce better results, but it is a very involved and time-consuming process.
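For a feel of what "infer the parameters, then predict" means, here is a small hand-rolled sketch in plain Python using a grid approximation. The data (7 heads in 10 flips) and the flat prior are made up for illustration; a PPL would do the inference step for you:

```python
# Made-up data: 7 heads out of 10 coin flips.
heads, flips = 7, 10

# Model: theta ~ Uniform(0, 1); each flip lands heads with probability theta.
grid = [i / 200 for i in range(201)]          # candidate values of theta

# Flat prior, so the unnormalized posterior is just the likelihood.
unnormalized = [t**heads * (1 - t)**(flips - heads) for t in grid]
total = sum(unnormalized)
posterior = [w / total for w in unnormalized]  # normalize to sum to 1

# Prediction: probability the next flip is heads, averaged over the posterior.
p_next_heads = sum(t * p for t, p in zip(grid, posterior))
print(p_next_heads)
```

With a flat prior the exact posterior is Beta(8, 4), whose mean is 8/12, so the grid estimate should come out close to 0.667. Grids stop being practical after a handful of parameters, which is where automated inference earns its keep.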

[+] unclesaamm|8 years ago|reply

My impressions after familiarizing myself with Stan, BUGS, JAGS, and a few Python libraries for Bayesian inference:

Probabilistic programming languages are designed to specify and evaluate statistical models. They usually sit somewhere between declarative and imperative languages. They're basically DSLs in the sense that they come with built-in primitives for various statistical distributions and link functions, and the model is sometimes, but not always, lazily evaluated. But they also often have procedural components like variable assignment and loops, and in some software like Stan, the model has to be specified in a certain order since it is evaluated procedurally.

The languages are also de facto coupled with the engines that run them -- I don't know of any probabilistic languages that have been formalized without a corresponding sampling engine, though there have been cases where a language is "forked", and separate engines built for the same language (like WinBUGS and OpenBUGS).

[+] pinouchon|8 years ago|reply

This is not a simple explanation, but one I quite like (by Dan Roy) https://www.youtube.com/watch?v=TFXcVlKqPlM

[+] alex_hirner|8 years ago|reply

It maps observations where you have a clue about how they were created (priors) to a stochastic function. So quite literally, you end up with a probabilistic program.

There are many solutions for each prior. Priors can for example be expressed as computational graphs with random variables.

[+] tbenst|8 years ago|reply

How flexible is this compared to Church/Venture, Webppl, Anglican, etc? Does it support recursively-defined generative processes?

Edit: nvm, Noah Goodman is behind this, who created Webppl. This looks super flexible and awesome, congrats all!

[+] nl|8 years ago|reply

The combination of probabilistic programming and deep learning is pretty interesting to me because that's what I have going on in two of my work projects.

What we do is build features using deep learning models, then use those to extract simple linear or categorical features which we condition our probabilistic model on.

We've found it quite hard to use very high numbers of variables in the probabilistic model.

Has anyone found a better way of doing this?

[+] wakkaflokka|8 years ago|reply

Do you have a recommended resource on how to feature engineer with DNNs for use as inputs in other models?

[+] orbifold|8 years ago|reply

It was bound to happen. The dynamic control flow of PyTorch makes this really interesting compared to Edward.

[+] catchmeifyoucan|8 years ago|reply

Blocked on our corporate network at CapitalOne as "suspicious"

[+] Diederich|8 years ago|reply

Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend. Pyro enables flexible and expressive deep probabilistic modeling, unifying the best of modern deep learning and Bayesian modeling. It was designed with these key principles:

Universal: Pyro can represent any computable probability distribution.
Scalable: Pyro scales to large data sets with little overhead.
Minimal: Pyro is implemented with a small core of powerful, composable abstractions.
Flexible: Pyro aims for automation when you want it, control when you need it.

Check out the blog post for more background or dive into the tutorials.

https://eng.uber.com/pyro/

[+] yorwba|8 years ago|reply

Likely due to the Anguilla top-level domain. I think it's quite silly to block websites based on the country of their registrar, but unfortunately that kind of thinking is quite common.

[+] traverseda|8 years ago|reply

How does it compare to Pyro, the python-remote-objects library?

[+] almstimplmntd|8 years ago|reply

This post is about Pyro (probabilistic programming framework, with a clever logo "pi rho"), not to be confused with https://github.com/irmen/Pyro4

Completely independent projects, just a name collision.

[+] pinouchon|8 years ago|reply

They are similar the way Java and JavaScript are similar.

[+] l5870uoo9y|8 years ago|reply

Anyone have a qualified opinion on how PyTorch compares to TensorFlow?

[+] lowpro|8 years ago|reply

Siraj Raval covers PyTorch very well in a 5 min video (https://youtu.be/nbJ-2G2GXL0).

It comes down to design decisions, which I'm not qualified to go into. This article, which made the front page last week, covers downsides of TensorFlow that people rarely talk about: http://nicodjimenez.github.io/2017/10/08/tensorflow.html

And this interview with a TensorFlow engineer (10 mins) explains a little bit about those design decisions (https://youtu.be/axRHotkkTVI).

[+] amelius|8 years ago|reply

I guess whatever qualities the two platforms have do not really matter compared to how much traction they get. If the world converges to one platform, then you should probably go with that platform, no matter how good the other platform is. The reason is that things are just moving too fast, and you don't want to spend your time porting new research from one platform to the other.

That said, I'm curious why PyTorch (or specifically Autograd) couldn't have been built on top of TF.

[+] newlyretired|8 years ago|reply

http://www.fast.ai/2017/09/08/introducing-pytorch-for-fastai...

[+] steev|8 years ago|reply

I would roughly classify them based on use cases:

* If you are working on research such as optimization or other improvements on the algorithms of training neural networks, PyTorch is a better option as it is (in my experience) much more understandable and easily modified.

* If you are experimenting with network architectures and aren't going to be mucking around with the internals (e.g., developing a new optimization algorithm), TensorFlow is a better option.

[+] polskibus|8 years ago|reply

Is there a TensorFlow-based equivalent?

[+] julien_c|8 years ago|reply

Maybe http://edwardlib.org/

[+] bj0|8 years ago|reply

There's already a really cool python project called Pyro (python remote objects): https://pyro4.readthedocs.io/en/stable/intro.html

I haven't used it since I was in undergrad (>10 years) where I used it to communicate between nodes on a small cluster, but it made RPC really easy.

[+] zero_iq|8 years ago|reply

Indeed. It was (is?) pretty well known, but I've not heard it mentioned for a long time. With all the fashionable modern RPC and serialisations around nowadays, perhaps the original Pyro is now obscure enough that the name can be reused? Ideally, though, it would be nice to know for sure that an existing project is considered obsolete before causing any confusion.
