top | item 43521039

Valk3_ | 11 months ago

I've only skimmed through both of them, so I might be entirely incorrect here, but isn't the essential approach a bit different for both? The MIT one emphasizes not viewing matrices as tables of entries, but instead as holistic mathematical objects. So when they perform the derivatives, they try to avoid the "element-wise" approach of differentiation, while the one by Parr and Howard seems to take the "element-wise" approach, although with some shortcuts.
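To make the contrast concrete, here is a hypothetical sketch (my own illustration, not from either text): for f(x) = x^T A x, the "holistic" view gives the closed-form gradient (A + A^T) x in one step, while the "element-wise" view computes each partial df/dx_i separately, which I approximate here with finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

# f(x) = x^T A x
f = lambda v: v @ A @ v

# Holistic view: x is a single object; grad f = (A + A^T) x.
grad_holistic = (A + A.T) @ x

# Element-wise view: one partial derivative per coordinate,
# approximated here by central finite differences.
eps = 1e-6
grad_elementwise = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(3)
])

assert np.allclose(grad_holistic, grad_elementwise, atol=1e-5)
```

Both views agree numerically; the difference is how much index bookkeeping you carry through the derivation.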

godelski | 11 months ago

I got the same impression as you: the Bright, Edelman, and Johnson (MIT) notes seem more driven by mathematicians, whereas I find the Parr and Howard paper wanting. Though I agree with them:

  >  Note that you do not need to understand this material before you start learning to train and use deep learning in practice
I have an alternative version:

  > You don't need to know math to train good models, but you do need to know math to know why your models are wrong. 
Referencing "All models are wrong"

I think another part is that the Bright, Edelman, and Johnson notes also introduce concepts such as Automatic Differentiation, Root Finding, Finite Difference Methods, and ODEs. With that in mind, it is far more important to come at it from an approach where you understand the structures.
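As a small aside on one of those concepts, forward-mode automatic differentiation can be sketched with dual numbers in a few lines. This is my own minimal illustration of the general idea, not code from the notes:

```python
# Minimal forward-mode AD via dual numbers: a value paired with
# its derivative, propagated through + and * by the chain rule.
class Dual:
    def __init__(self, val, eps=0.0):
        self.val, self.eps = val, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.eps + other.eps)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)
    __rmul__ = __mul__

def derivative(f, x):
    # Seed the derivative slot with 1.0 and read it back out.
    return f(Dual(x, 1.0)).eps

# d/dx (3x^2 + 2x) = 6x + 2, so at x = 2 this is 14.
assert derivative(lambda x: 3 * x * x + 2 * x, 2.0) == 14.0
```

Unlike finite differences, this computes the derivative exactly (up to floating point), which is part of why AD underpins deep learning frameworks.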

I think there is an odd pushback against math in the ML world (I'm an ML researcher). Mostly because it is hard and there's a lot of success you can gain without it. But I don't think that should discourage people from learning math. And frankly, the math is extremely useful. If we're ever going to understand these models, we're going to need to do a fuck ton more math. So best to get started sooner rather than later (if that's anyone's personal goal, anyways).

Valk3_ | 11 months ago

Regarding the math in ML, what I would love to see (links if you have any) is a nuanced take on the matter, showing examples from both sides: a good-faith discussion of what contributions one can make with and without a strong math background in the ML world.

edit: On the math side, I've encountered one that seemed unique, as I haven't seen anything like it elsewhere: https://irregular-rhomboid.github.io/2022/12/07/applied-math.... However, it only lists courses from his math education that he thinks are relevant to ML; each course is given a very short description and/or motivation as to its usefulness for ML.

I like these concluding remarks:

  > Through my curriculum, I learned about a broad variety of subjects that provide useful ideas and intuitions when applied to ML. Arguably the most valuable thing I got out of it is a rough map of mathematics that I can use to navigate and learn more advanced topics on my own.

  > Having already been exposed to these ideas, I wasn’t confused when I encountered them in ML papers. Rather, I could leverage them to get intuition about the ML part.

  > Strictly speaking, the only math that is actually needed for ML is real analysis, linear algebra, probability and optimization. And even there, your mileage may vary. Everything else is helpful, because it provides additional language and intuition. But if you’re trying to tackle hard problems like alignment or actually getting a grasp on what large neural nets actually do, you need all the intuition you can get. If you’re already confused about the simple cases, you have no hope of deconfusing the complex ones.