top | item 22546974

fcharton | 6 years ago

Textbook problems are usually short, with short solutions, and each demonstrates one specific rule. They are better handled by classical (rule-based) tools. Deep learning tools would either memorize them or resort to a rule-based sub-module.

For integrals, solvability depends on the function set you work with. Since we use elementary functions on the real domain, many integrals have no solution in that set. We could have gone for a larger set (adding erf, the Fresnel integrals, up to Liouvillian functions). This would mean more solvable cases.
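A classic instance of this: exp(-x^2) is elementary, but its antiderivative is not, and can only be written with erf, one of the functions mentioned above. A quick numerical sanity check of that identity (plain Python, Simpson's rule; the point x = 1.3 and the tolerance are arbitrary choices for illustration):

```python
import math

def simpson(f, a, b, n=1000):
    # composite Simpson's rule on [a, b] with n (even) subintervals
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

x = 1.3
# integral of exp(-t^2) from 0 to x has no elementary closed form...
numeric = simpson(lambda t: math.exp(-t * t), 0.0, x)
# ...but equals sqrt(pi)/2 * erf(x) once erf is admitted into the set
closed = math.sqrt(math.pi) / 2 * math.erf(x)
assert abs(numeric - closed) < 1e-6
```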

As for the engineering distribution, no one knows what it is. The best we can do is generate a broader training set, knowing that it will generalize better (this is the key takeaway of our appendix). BWD+IBP is a step in this direction, but to progress further we need a better understanding of the problem space and of issues related to simplification. We are working on this now.
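For readers unfamiliar with the paper's generators: BWD works backward from the answer. Sample a function F, differentiate it to get the integrand f = F', and emit the supervised pair (f, F). A minimal numerical sketch of that idea (the particular F and the finite-difference check are just illustrations, not the paper's actual sampler):

```python
import math

# BWD idea: pick an antiderivative F, differentiate it to get the
# integrand f = F'; the training pair is then (f, F).
F = lambda x: x * math.sin(x)                  # sampled "answer"
f = lambda x: math.sin(x) + x * math.cos(x)    # product rule: F'

def num_deriv(g, x, h=1e-6):
    # central finite difference, used only to sanity-check the pair
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (0.3, 1.0, 2.5):
    assert abs(num_deriv(F, x) - f(x)) < 1e-5
```

IBP extends this by applying integration by parts to known pairs, producing integrands that a pure backward pass would rarely reach.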

whatshisface | 6 years ago

>Since we use elementary functions on the real domain, a lot of integrals have no solution. We could have gone for a larger set (adding erf, the Fresnels, up to Liouvillian functions). This would mean more solvable cases.

Here is something to think about if you want more solvable cases: even in the case of sine and cosine, solving a differential equation really means reducing it to a combination of the solutions to simpler differential equations. The sine function can be defined as the solution to a particular differential equation, as can all of the fancier functions. So in a sense it's kind of like factorization, where you have "prime" equations whose only solution is a transcendental function defined as being their solution, and "composite" equations whose solutions can be written as a combination of solutions to "prime" equations. So really all of the rare functions belong to the same general scheme.
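The sine example can be made concrete: sin is fully determined by y'' = -y with y(0) = 0, y'(0) = 1, so numerically integrating that "prime" equation reproduces it without ever calling a sine routine. A small RK4 sketch (step count and tolerance are arbitrary):

```python
import math

def rk4_sin(x_end, n=1000):
    # integrate the system y' = v, v' = -y with RK4;
    # the initial conditions y(0)=0, v(0)=1 single out sin(x)
    h = x_end / n
    y, v = 0.0, 1.0
    for _ in range(n):
        k1y, k1v = v, -y
        k2y, k2v = v + h / 2 * k1v, -(y + h / 2 * k1y)
        k3y, k3v = v + h / 2 * k2v, -(y + h / 2 * k2y)
        k4y, k4v = v + h * k3v, -(y + h * k3y)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return y

# the ODE solution agrees with the library sine
assert abs(rk4_sin(1.0) - math.sin(1.0)) < 1e-6
```

Swapping in a different "prime" equation (say y'' = x·y for Airy functions) is the same code with one line changed, which is the sense in which the rare functions belong to the same scheme.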

wendyshu | 6 years ago

Why not test on a set of problems that come up in practice, rather than ones generated by an artificial distribution?

fcharton | 6 years ago

For training, you need a generator because you want millions of solved examples for deep learning to work.

At test time, you usually want a test set from the same distribution as the training data (or at least related to it in some controllable way), or it becomes very difficult to interpret the results.

Suppose my test set comes from a different and unknown distribution (real problems sampled in some way).

If I get good results, is it because the training worked, or because the test set was "comparatively easy"? If I get bad results, is it because the model did not learn, or because the test set was too far away from the training examples?