dwiel | 3 years ago

Right, the claim was that "Neural networks give probability estimates. Bayesian methods give us probability estimates AND uncertainty" which presents a false dichotomy. I think we agree.

steppi | 3 years ago

Ah yes, got you. It is a false dichotomy because it neglects that there's such a thing as Bayesian neural networks. Also, ensembles of ordinary neural networks with random initializations approximate Bayesian inference in a sense, and I think this is relatively well known.

datastoat | 3 years ago

Indeed, there are Bayesian neural networks and there are non-Bayesian neural networks, and I shouldn't have implied that all neural networks are non-Bayesian.

I'm just trying to point out that there is a dichotomy between the Bayesian and the non-Bayesian, and that the standard neural network models are non-Bayesian, and that we need Bayesianism (or something like it) to talk about (epistemic) uncertainty.

Standard neural networks are non-Bayesian, because they do not treat the neural network parameters as random variables. This includes most of the examples that have been mentioned in this thread: classifiers (which output a probability distribution over labels), networks that estimate mean and variance, and VAEs (which use Bayes's rule for the latent variable but not for the model parameters). These networks all deal with probability distributions, but that's not enough for us to call them Bayesian.
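To make the classifier case concrete: a softmax head produces a perfectly good probability distribution over labels, but that distribution is computed from a single point estimate of the weights, so it carries no information about our uncertainty in the weights themselves. A minimal sketch (the logits are invented for illustration):

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits from a trained classifier whose weights are fixed
# point estimates, not random variables.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
print(probs)  # a distribution over labels, yet nothing here quantifies
              # epistemic uncertainty about the model itself
```

The output sums to 1 and looks like a credence over classes, but a second, differently-trained network might produce a very different distribution on the same input, and nothing in this single forward pass can tell you that.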

Bayesian neural networks are easy, in principle -- if we treat the edge weights of a neural network as having a distribution, then the entire neural network is Bayesian. And as you say these can be approximated, e.g. by using dropout at inference time [0], or by careful use of ensemble methods [1].

[0] https://arxiv.org/abs/1506.02142

Quote: "Deep learning tools have gained tremendous attention in applied machine learning. However such tools for regression and classification do not capture model uncertainty."
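The idea in [0] can be sketched in a few lines: keep dropout switched on at test time and treat each stochastic forward pass as an approximate sample from the posterior predictive. A toy one-layer version (the weights are invented stand-ins for a trained model):

```python
import random
import statistics

def forward(x, weights, drop_p=0.5):
    """One stochastic forward pass through a toy one-layer model:
    each weight is independently dropped with probability drop_p,
    and kept weights are rescaled (inverted dropout)."""
    total = 0.0
    for w in weights:
        if random.random() >= drop_p:          # keep this weight
            total += (w / (1.0 - drop_p)) * x  # rescale to preserve the mean
    return total

random.seed(0)
weights = [0.2, -0.5, 0.9, 0.1]  # stand-in for trained weights (invented)
x = 1.0

# MC dropout: keep dropout ON at inference and average many passes.
samples = [forward(x, weights) for _ in range(1000)]
mean = statistics.fmean(samples)   # approximate predictive mean
std = statistics.pstdev(samples)   # spread ~ model (epistemic) uncertainty
print(f"prediction {mean:.2f} +/- {std:.2f}")
```

The mean recovers roughly the deterministic prediction, while the spread across passes is the uncertainty estimate that a single deterministic pass cannot provide.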

[1] https://arxiv.org/abs/1810.05546

Quote: "Ensembling NNs provides an easily implementable, scalable method for uncertainty quantification, however, it has been criticised for not being Bayesian."
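In its simplest form, the ensemble approach in [1] trains several networks from different random initializations and reads off the spread of their predictions as the uncertainty estimate. A toy sketch (the prediction values are invented; in practice each would come from an independently trained network):

```python
import statistics

# Hypothetical predictions from five independently initialized and trained
# networks, all evaluated on the same test input (values are invented).
preds = [2.31, 2.18, 2.45, 2.27, 2.40]

mean = statistics.fmean(preds)   # ensemble prediction
std = statistics.pstdev(preds)   # disagreement ~ epistemic uncertainty
print(f"prediction {mean:.3f} +/- {std:.3f}")
```

Whether this spread is a faithful posterior is exactly the point of contention the quote above refers to, but as a scalable, easily implemented uncertainty signal it works well in practice.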

dwiel | 3 years ago

Yeah right, in my experience I haven't needed as many networks in the ensemble as I first assumed. This paper [1] suggests 5-10, but in practice I've found that as few as 3 is often sufficient.

[1] https://arxiv.org/abs/1811.12188