top | item 17535218

TherML – Thermodynamics of Machine Learning

154 points | selimthegrim | 7 years ago | arxiv.org

24 comments

[+] mlthoughts2018 | 7 years ago
Does anyone really find it surprising that you can do contortions around KL divergence among candidate probabilistic models to the point of shoe-horning some analogue of thermodynamic laws and equilibria into machine learning? I don’t find this surprising, illuminating, or interesting.

It reminds me of the “information geometry of boosting” section of [0].

[0]: https://pdfs.semanticscholar.org/2fad/679058e465fc07f942cfed...
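For readers outside the field, a minimal sketch of the quantity being discussed (just the textbook definition of KL divergence, also known as relative entropy; the distributions here are made-up toy values, not anything from the paper):

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i), in nats.

    Measures how much information is lost when q is used to
    approximate p; zero iff the distributions are identical.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy candidate models over three outcomes (illustrative values only).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(kl_divergence(p, q))  # strictly positive, since p != q
print(kl_divergence(p, p))  # exactly zero
```

Its interpretation as "relative entropy" is the entry point for the thermodynamic analogies under discussion.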

[+] danielmorozoff | 7 years ago
Fundamentally, thermodynamics and information theory share a slew of similar ideas, because in many ways information theory came out of a thermodynamic thought process applied to signals. Even relatively recent work in ML, like Boltzmann machines, is taken directly from this overlap. That being said, stronger theoretical connections are well worth the effort: these fields do have some different ideas which, if shown to be similar, could yield discoveries and push our understanding forward. Some thoughts that come to mind are compression and model generalizability/convergence.
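The overlap mentioned above can be made concrete with a small sketch (toy energy levels of my own choosing, not from the paper): for a Boltzmann distribution, the thermodynamic free energy ⟨E⟩ − T·S, with S the Shannon entropy of the distribution, equals −T·ln Z exactly.

```python
import math

def boltzmann(energies, T):
    """Return the Boltzmann distribution p_i = exp(-E_i/T) / Z and Z."""
    weights = [math.exp(-e / T) for e in energies]
    Z = sum(weights)  # partition function
    return [w / Z for w in weights], Z

energies = [0.0, 1.0, 2.0, 5.0]  # arbitrary toy energy levels
T = 1.5                           # temperature (k_B = 1)
p, Z = boltzmann(energies, T)

avg_E = sum(pi * e for pi, e in zip(p, energies))  # internal energy <E>
S = -sum(pi * math.log(pi) for pi in p)            # Shannon entropy, in nats

F_thermo = avg_E - T * S        # thermodynamic free energy <E> - T*S
F_logZ = -T * math.log(Z)       # same quantity from the partition function

# The two agree exactly: the identity that ties statistical mechanics
# to information-theoretic entropy.
print(F_thermo, F_logZ)
```

The identity falls out in two lines: since ln p_i = −E_i/T − ln Z, the Shannon entropy is S = ⟨E⟩/T + ln Z, so ⟨E⟩ − T·S = −T·ln Z.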
[+] WhitneyLand | 7 years ago
Does anyone? What % of people make any connection at all between the fields?

Firstly, this is not a CS- or physics-only forum, so I would guess the subject could easily have never come up for many (if so, for the simpler/non-ML connections see https://en.wikipedia.org/wiki/Entropy_in_thermodynamics_and_...).

Secondly, I think general awareness beyond formal education has increased in a non-linear way over the last 20 years, largely because there has been so much science press around the celebrity of Stephen Hawking.

His celebrity alone raises awareness; it was compounded, however, by the drama depicted between him and others, and by some really interesting observations along the way (https://en.wikipedia.org/wiki/Holographic_principle).

Since all of this publicity has various inevitable Kevin Bacon trails back to the fundamental connections, I’d say the group making the connection is not only not everyone, but could be much smaller had things unfolded a little differently.

[+] orbifold | 7 years ago
Whether you find something like this surprising largely depends on your prior knowledge. There are quite different sets of people who work with statistics; theoretical physicists are one of them. So a recent advance in one field (machine learning) will inevitably draw in the hyenas from other fields, especially if one of their favorite toys (building increasingly complicated high-energy particle physics models) has largely been taken away from them, given the recent null results at the LHC. Which is how you get review articles like this one (https://arxiv.org/abs/1803.08823).
[+] c3534l | 7 years ago
I doubt even 1% of the population can decode that sentence, let alone find it so blindingly obvious it need not be said.
[+] mlazos | 7 years ago
This is why I enjoyed a computational probability class in college: the professor introduced all of the concepts starting from a problem in physics. Many methods in modern statistics were built to explain the physical behavior of particles under certain constraints.