nonagono | 1 year ago

It’s an introduction to a relatively niche new subfield. If I (an expert in the field but not the subfield) want to learn about differentiable programming, my only option before this monograph was to read through dozens of scattered papers that use different presentation styles, terminology, etc. Now I can read through the second half of this, around 100 pages, and jump back to the first half if there’s a prerequisite I don’t know.

That’s how most subfields are born. Assorted papers -> monograph -> textbook. The first arrow is defining the subfield as a discrete topic, which is immensely valuable. Only after you have that can you start optimizing for presentation to nonexperts.

blurbleblurble|1 year ago

The thing I like about this is that it frames all these optimization techniques + AD, etc. in the context of control flow and not just in the context of some trending neural network architecture. It doesn't assume you'll be using these techniques in a specific bubble; it gives the rest of us access to a broader perspective that experienced researchers have been slowly brewing for decades.

I've been trying to learn about applying gradient descent to a non-neural network problem, following a paper, and have found it very difficult to find introductory resources or code libraries that aren't explicitly geared toward training neural networks and running inference on them.
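For anyone in the same spot, here is a minimal sketch of what gradient descent on a non-neural-network problem can look like, with the derivative obtained by automatic differentiation through an ordinary loop. Everything in it is an assumption made for illustration: the `Dual` class (a bare-bones forward-mode AD), the toy simulation `final_position`, and the parameters `drag`, `dt`, `lr` are hypothetical and not taken from the monograph or from any library.

```python
from dataclasses import dataclass


def _lift(x):
    """Wrap a plain number as a constant Dual (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


@dataclass
class Dual:
    """Value plus its derivative w.r.t. one chosen input (forward-mode AD)."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __mul__(self, other):
        other = _lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def final_position(v0, steps=10, dt=0.1, drag=0.5):
    """Toy Euler simulation: a particle slowed by linear drag.

    The loop is plain Python control flow; derivatives flow through it
    because every arithmetic operation is overloaded on Dual.
    """
    x, v = Dual(0.0), _lift(v0)
    for _ in range(steps):
        x = x + v * dt
        v = v - v * (drag * dt)
    return x


def loss_and_grad(v0, target):
    """Squared miss distance and its derivative w.r.t. the initial velocity."""
    err = final_position(Dual(v0, 1.0)) - target  # seed d(v0)/d(v0) = 1
    loss = err * err
    return loss.val, loss.der


def fit(target=2.0, lr=0.5, iters=100):
    """Plain gradient descent on the initial velocity."""
    v0 = 0.0
    for _ in range(iters):
        _, grad = loss_and_grad(v0, target)
        v0 -= lr * grad
    return v0


if __name__ == "__main__":
    v0 = fit()
    print(v0, final_position(v0).val)
```

Forward mode is used here only because it fits in a few lines; with many parameters, reverse mode (what most AD libraries provide) is usually the better fit.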

bsdpufferfish|1 year ago

Differentiable programming is hardly a "subfield"; it can be explained in a paragraph if you know calculus well. If there is any subfield, it's in researching specific compiler optimizations.

nonagono|1 year ago

Well clearly not, since at least 100 pages of content here are specifically about differentiable programming and not prerequisites :)

More seriously, it's about doing the impossible. Formally, some functions are nondifferentiable, period. But it would be cool if we could actually "more or less" differentiate them. For that we'll necessarily need a bag of tricks which is now coalescing into "techniques" and "principles".
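A small sketch of one such trick, as an illustration only: replacing a hard step (whose derivative is zero wherever it exists, so it gives gradient descent nothing to work with) by a sigmoid relaxation. The function names and the `temperature` parameter are assumptions for this example, and this is just one of many possible relaxations, not necessarily the one the monograph favours.

```python
import math


def step(x):
    """Hard threshold: derivative is 0 for x != 0 and undefined at x = 0."""
    return 1.0 if x > 0 else 0.0


def soft_step(x, temperature=0.1):
    """Sigmoid relaxation of step(); approaches step() as temperature -> 0."""
    z = x / temperature
    # Two algebraically equal forms, chosen so exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_step_grad(x, temperature=0.1):
    """Exact derivative of soft_step: s * (1 - s) / temperature."""
    s = soft_step(x, temperature)
    return s * (1.0 - s) / temperature


if __name__ == "__main__":
    for x in (-0.5, -0.05, 0.0, 0.05, 0.5):
        print(x, step(x), soft_step(x), soft_step_grad(x))
```

The temperature is the usual trade-off: a smaller value tracks the original function more closely but concentrates the gradient in a narrower band around the threshold.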

Cf. numerical analysis. It takes a page or two to set up your definitions and show that many functions are badly conditioned, period. And yet we still want to compute them, so we've been building the bag of tricks for almost a century now.
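To make the analogy concrete, a rough sketch: the relative condition number |x f'(x) / f(x)| measures how much a small relative error in the input is amplified in the output. For log it blows up near x = 1, and one standard trick is to change the input representation (pass t = x - 1 and use `math.log1p`). The helper name `relative_condition` is made up for this example.

```python
import math


def relative_condition(f, df, x):
    """Relative condition number |x * f'(x) / f(x)| of f at x."""
    return abs(x * df(x) / f(x))


def dlog(x):
    """Derivative of the natural logarithm."""
    return 1.0 / x


# Well conditioned far from 1, badly conditioned close to 1.
kappa_far = relative_condition(math.log, dlog, 10.0)
kappa_near = relative_condition(math.log, dlog, 1.0 + 1e-8)

# The trick: take t = x - 1 as the input instead of x, so the rounding
# error from forming 1 + t never happens.
t = 1e-10
naive = math.log(1.0 + t)    # 1 + t is rounded before log sees it
careful = math.log1p(t)      # computes log(1 + t) from t directly

if __name__ == "__main__":
    print(kappa_far, kappa_near)
    print(naive, careful)
```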