Valk3_|11 months ago
I've only skimmed through both of them, so I might be entirely incorrect here, but isn't the essential approach a bit different for each? The MIT one emphasizes not viewing matrices as tables of entries, but instead as holistic mathematical objects. So when they take derivatives, they try to avoid the "element-wise" approach of differentiation, while the one by Parr and Howard seems to take the "element-wise" approach, although with some shortcuts.
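To make the distinction concrete, here's a toy sketch of my own (not taken from either paper) for f(x) = xᵀAx: the holistic view uses the matrix identity ∇f = (A + Aᵀ)x directly, while the element-wise view differentiates one coordinate at a time (approximated here with finite differences):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

# Scalar function f(x) = x^T A x.
f = lambda v: v @ A @ v

# "Holistic" style: treat x as a single object and apply the known
# matrix-calculus identity grad f = (A + A^T) x.
grad_holistic = (A + A.T) @ x

# "Element-wise" style: differentiate entry by entry, here approximated
# with a central finite difference in each coordinate direction.
h = 1e-6
grad_elementwise = np.array(
    [(f(x + h * np.eye(3)[i]) - f(x - h * np.eye(3)[i])) / (2 * h)
     for i in range(3)]
)

print(np.allclose(grad_holistic, grad_elementwise, atol=1e-4))  # → True
```

Both styles agree numerically; the difference is in how you organize the derivation, not the answer.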
godelski|11 months ago
I think another part is that the Bright, Edelman, and Johnson paper also introduces concepts such as Automatic Differentiation, Root Finding, Finite Difference Methods, and ODEs. With that in mind, it is far more important to come at it from the approach where you understand the structures.
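As a rough illustration of one of those concepts, here's a minimal forward-mode automatic differentiation sketch using dual numbers. This is my own toy example, not code from the course; it only supports `+`, `*`, and `sin`, which is enough to show the idea:

```python
import math
from dataclasses import dataclass


@dataclass
class Dual:
    """A dual number carrying a value and its derivative."""
    val: float  # function value
    der: float  # derivative, propagated by the chain rule

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)


def sin(d):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(d.val), math.cos(d.val) * d.der)


# Differentiate f(x) = x*sin(x) + x at x = 2 by seeding der = 1.
x = Dual(2.0, 1.0)
y = x * sin(x) + x

# Exact derivative: sin(x) + x*cos(x) + 1.
exact = math.sin(2.0) + 2.0 * math.cos(2.0) + 1.0
print(abs(y.der - exact) < 1e-12)  # → True
```

The point is that the derivative falls out of the arithmetic itself, with no symbolic manipulation and no finite-difference truncation error.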
I think there is an odd pushback against math in the ML world (I'm an ML researcher), mostly because it is hard and there's a lot of success you can gain without it. But I don't think that should discourage people from learning math. And frankly, the math is extremely useful. If we're ever going to understand these models, we're going to need to do a fuck ton more math. So best to get started sooner rather than later (if that's anyone's personal goal, anyway).
Valk3_|11 months ago
edit: On the math side, I've encountered one resource that seemed unique, as I haven't seen anything like it elsewhere: https://irregular-rhomboid.github.io/2022/12/07/applied-math.... However, it only lists courses from the author's math education that he thinks are relevant to ML; each course is given a very short description and/or motivation as to its usefulness for ML.
I like these concluding remarks:
Through my curriculum, I learned about a broad variety of subjects that provide useful ideas and intuitions when applied to ML. Arguably the most valuable thing I got out of it is a rough map of mathematics that I can use to navigate and learn more advanced topics on my own.
Having already been exposed to these ideas, I wasn’t confused when I encountered them in ML papers. Rather, I could leverage them to get intuition about the ML part.
Strictly speaking, the only math that is actually needed for ML is real analysis, linear algebra, probability and optimization. And even there, your mileage may vary. Everything else is helpful, because it provides additional language and intuition. But if you’re trying to tackle hard problems like alignment or actually getting a grasp on what large neural nets actually do, you need all the intuition you can get. If you’re already confused about the simple cases, you have no hope of deconfusing the complex ones.