top | item 18794026

(no title)

amuresan | 7 years ago

Could someone with a better understanding of the topic give a brief example of how this can be used in practice?

discuss

ivansavz|7 years ago

This is a theory paper so no immediate practical applications. Potential applications would be generalizing derivative calculations used in currently ML libraries, see https://en.wikipedia.org/wiki/Automatic_differentiation and http://www.cs.nuim.ie/~gunes/files/Baydin-MSR-Slides-2016020...

Useful quote from paper: "We have been inspired by the development of differentiable programming to formalize a theoretical model, that encompasses the ideas underlying differentiable programming and provides a more general setting for investigations of differentiable programs. The presented theoretical model enables analytic investigations of differentiable programs through algebraic tools, that are closer to the field of programming; i.e. the presented operators can take the same role as higher order functions in functional programming."

Someone|7 years ago

Let’s say you wrote a function f(x) and want to know where it is zero.

The traditional way is Newton’s method (https://en.wikipedia.org/wiki/Newton's_method)

That requires you to compute f’(x) at various points.

You could either pick some small epsilon and approximate it as (f(x+ε)-f(x-ε))/2ε or implement a function that symbolically computes it (likely returning better approximations). This kind of stuff gives you the latter for free.

That’s less work (more so if you’re continuously tweaking the definition of f(x)) and less error prone (once the bugs have been ironed out :-) ).

Now, you’ll ask “who ever wants to compute zeroes?”. Turns out that’s basically (1) what machine learning does. It searches for a maximum, but that can be done using Newton’s method (https://en.wikipedia.org/wiki/Newton's_method#Minimization_a...)

(One problem that these kinds of approaches overlook is that of numerical stability. I fear many uses will see implementers help their automatic differentiation tooling to improve numerical stability)

(1) yes, I’m painting in very broad strokes, losing detail.

75dvtwin|7 years ago

@amuresan, here is an attempt to answer your question, from a non-expert:

---

In a variety of problems, one needs to compute minimum or maximum of some numeric function. Later, the result of the above, is used to construct 'optimal' path, 'optimal solution'/etc.

As an example, if one is writing a database optimization (eg what query path to take will (likely), take less time, or less resources). Or a distributed task scheduled...

Min/Max of a computation is also a function (represented as a derivative).

So to compute a derivative, we have to take a function that can be written in Math notation (symbolic differentiation is needed ), or as function in some programming language (automatic differentiation is needed)

The transformation program (the one that finds the derivative of another computation express in some programming language) is what this paper is covering.

How do we reason about these transformations? are they correct ?, how much memory will they need to store intermediate states? can some of these states be collapsed without negatively affecting results?

So this paper is coming up with formalisms akin to operational calculus to describe these programs, and provides apparatus

In their conclusion paragraph:

".... In this paper we presented a theoretical model for diﬀerentiable programming.

Throughout the course of the paper we have shown the model to be a complete description of diﬀerentiable programming.

Furthermore, the innate algebraic structure of the framework supplements the descriptive power of a language with the ability to reason about the programs it implements, by way of operational calculus. We believe operational calculus has a place in the evolution of computer science, where languages are to be endowed with algebraic constructs that hold power over analytic properties of the programs they implement. …"

Also, I think, that methods to formally describe classes of programs, are really good and useful.

I am sure, some day, methods like this, and the whole machinery of Category theory will allow us to construct with 'correctness guarantees' and reason about -- not just numeric computations, but also the typical 'data donkey' transformations between data stores/message queues that are so often implemented by hand in the enterprises.