top | item 20570025

Math Basics for Computer Science and Machine Learning [pdf]

883 points| oldgun | 6 years ago |cis.upenn.edu

118 comments

order
[+] xiaolingxiao|6 years ago|reply
The professor who wrote this is Jean Gallier, and I had him for advanced linear algebra at Penn. I am also pretty close to him in so far as a student can be close to a professor. On a personal note, he is one of the funniest professor I've had, and all math professors are characters.

For the people who are interested in ML, the thing to remember here is that he is a Serious mathematician, and he values rigor and in-depth understanding above all. A lot of his three star homework problems were basically impossible. He writes books first and foremost so he can understand things better. In math books, there's the book you first read when you don't understand something, then the book you read when you understand everything. This is book in the link.

for linear algebra, this:https://www.amazon.com/Introduction-Linear-Algebra-Gilbert-S...)

[+] Gene_Parmesan|6 years ago|reply
From the start of Chapter 2:

"In the following four chapters, the basic algebraic structures (groups, rings, fields, vectorspaces) are reviewed, with a major emphasis on vector spaces. Basic notions of linear algebra such as vector spaces, subspaces, linear combinations, linear independence, [...], dual spaces,hyperplanes, transpose of a linear maps, are reviewed."

If anyone needs to start even earlier than this, I've actually found "3D Math Basics for Graphics and Game Development" to be a good true intro for linear algebra-related stuff. I think this would probably hold even if your primary interest is something other than graphics/game dev. Some of the text in that book's intro is a little cringey with its reliance on kind of juvenile game references, but I didn't find that sort of writing continuing during the actual text. So just push past that stuff.

I got a copy of it to act as a refresher before diving into Real-Time Collision Detection since it's been quite a long time since formal math for me (as in, high school, because I'm self-taught in CS). I've managed to make up a lot of ground by working hard and finding classes to audit online (Strang's linear alg course on OCW is a good one), but I have found that depressingly few math texts which claim to be "introductory" are actually truly introductory.

This isn't a slight against the linked work, I absolutely love when profs make resources such as this freely available.

"How to Prove It" and "Book of Proof" are also great intros to formal math, if less immediately practical.

[+] hx2a|6 years ago|reply
> If anyone needs to start even earlier than this, I've actually found "3D Math Basics for Graphics and Game Development" to be a good true intro for linear algebra-related stuff.

Did you mean to write "3D Math Primer for Graphics and Game Development" [1]? If you did, I agree 100%. I got a lot out of this book and was able to put it to good use for several projects.

[1] https://www.amazon.com/Math-Primer-Graphics-Game-Development...

[+] codesushi42|6 years ago|reply
I would disagree about the gamedev book reference, unless you are referring to the real basics of linear algebra.

The really important concepts for ML are least squares, eigenvalues and vectors, and SVD. Those concepts are not very relevant to game programming.

Well, least squares can be solved with projection, which is relevant for converting between coordinate spaces. But game dev isn't going to give you that intuition.

[+] commandlinefan|6 years ago|reply
Not to look a gift horse in the mouth, but I'm always irritated by math books that include practice exercises but no answers in the back of the book to check your work against.
[+] Liquix|6 years ago|reply
Do you have a link to that book perchance?
[+] jointpdf|6 years ago|reply
“Math Basics” is quite the misnomer—it gives the impression that one would need to study all of the contents of this book to be an effective practitioner in CS or ML. Memorizing every definition and theorem in this book would be neither necessary nor sufficient for that purpose.

Keep in mind it can take an hour, and sometimes way more, to really absorb a single page of a math book like this (do the math). This is more of a reference text.

[+] worldsayshi|6 years ago|reply
A friend with a math msc aluded at this, that it's kind of a math meme to call material "basic" or "introduction to" for rather advanced stuff.

I feel like it's some kind of misguided intellectual humility. Kind of feels vaguely related to how so many Haskell packages are version "0.*".

[+] PopeDotNinja|6 years ago|reply
This is book 1962 pages long. If this is basic, how long is the advanced book?!
[+] duckqlz|6 years ago|reply
> “Math Basics” is quite the misnomer—it gives the impression that one would need to study all of the contents of this book to be an effective practitioner in CS or ML.

To me basics mean that if you study this entire book you won’t be able to understand ML otherwise it would say comprehensive. Furthermore the math presented in this book are all taught in 1st year courses for most CS programs I’ve encountered.

> Keep in mind it can take an hour, and sometimes way more, to really absorb a single page of a math.

Learning is a personal experience and happens at differing rates for different people. While I do agree this book is rather terse and would serve as a good reference any added explanations around the proofs would force a split to multiple publications so I can see why the authors chose to present it in the way they did.

Overall I have found this text easy to digest and well formulated and thank the authors and poster.

[+] collyw|6 years ago|reply
Been writing software for over 15 years and its very rarely that I actually need to use any Maths. Occasionally yes, but very rarely.
[+] melodrama|6 years ago|reply
Great book.

I think it's a good time to mention a couple of nice books (related)

1. Elementary intro to math of machine learning [0]. Its style is a bit less austere than that of OP's. It also has a chapter on probability. It could possible serve as a great prequel to the book linked in the OP.

2. The book on probability related topics of general data science: high-dimensional geometry, random walks, Markov chains, random graphs, various related algorithms etc [1]

3. Support for people who'd like to read books like the one linked in the OP, but never seen any kind of higher math before [2]. This book has a cover that screams trashy book extremely skimpy on actual info (anyone who reads a lot of tech books knows what I am talking about), but surprisingly,it contains everything it says it does and in great detail. Not even actual math textbooks (say, Springer) are usually written with this much detail. Author likes to add bullet point style elaboration to almost every definition and theorem which is (almost) never the case with gazillions of books usually titled "Abstract Algebra", "Real Analysis", "Complex Analysis" etc. Some such books sometimes attach words like "friendly" to their title (say, "Friendly Measure Theory For Idiots") and still do not rise to the occasion. Worse yet, a ton (if not most) of these books are exact clones of each other with different author names attached. The linked book doesn't suffer from any of these problems.

[0] Mathematics For Machine Learning by Deisentoth, Faisal, Ong

https://mml-book.github.io/book/mml-book.pdf

[1] Foundations Of Data Science By Blum, Hopcroft, Kannan

http://www.cs.cornell.edu/jeh/book%20no%20so;utions%20March%...

2] Pure Mathematics for Beginners: A Rigorous Introduction to Logic, Set Theory, Abstract Algebra, Number Theory, Real Analysis, Topology, Complex Analysis, and Linear Algebra by Steve Warner

https://www.amazon.com/Pure-Mathematics-Beginners-Rigorous-I...

[+] iamcreasy|6 years ago|reply
I'll check out the last one. Currently I am teaching myself real analysis.
[+] SKILNER|6 years ago|reply
Very first sentence of 2.1 is full of notation, symbols and terms that I, as a prospective student, might not understand.

So many teachers seem incapable of stepping outside their sphere of knowledge and seeing what they know and others do not. And so much work went into this.

[+] bradlys|6 years ago|reply
Even as someone who did their undergrad in math, it's a bit tough to digest. I'd have to look up a lot of the terms again. It's been five years since I was in college. Shows how little I've used it all since I graduated.

It definitely looks more like, "math basics" for x field. Kinda like "automotive basics" for Honda or Ford vehicles. Where it's presumed that you know a lot of automotive lingo to begin with and you just need to know what spark plugs go with what engine. And not, "what is a spark plug?"

[+] baddox|6 years ago|reply
It sounds like you correctly identified a mismatch between audience and material, but I don’t think it’s appropriate to describe the teachers/authors as having made an error. Why would you assume that you are the intended audience of this text and thus that the authors have made some mistake?
[+] svat|6 years ago|reply
As mentioned in the paragraph above it, the chapter is a review, i.e. it's trying to quickly refresh the reader's memory on things they're already supposed to know; it's not trying to teach something new. So the sentence (“The set R of real numbers has two operations +: R × R → R (addition) and ∗: R × R → R (multiplication) satisfying properties that make R into an abelian group under +, and R − {0} = R∗ into an abelian group under ∗”) seems fine for that purpose.

If you look at any of the later chapters that are trying to teach something new, they are much more gentle and motivate the topic of that chapter: see e.g. “24.1 Affine Spaces” on page 759, or “26.1 Why Projective Spaces?” on page 823, etc.

Other chapters that are meant as a review are similarly terse and quick to the point (like Chapter 2), e.g. Chapter 37 “Topology” on page 1287.

I think it's good when books make conscious choices about what they're teaching versus assuming as a prerequisite (and communicate it to the reader, by using terms like “reviewed” — presumably the yet-to-be-written Introduction chapter will also mention this more explicitly).

[+] lone_haxx0r|6 years ago|reply
On the other hand, it's a perfect refresher for those of us who "know" this, but somehow forgot most of it.
[+] graycat|6 years ago|reply
Here's what he is doing. He wants to start with the set of real numbers, intuitively the points on the line, usually denoted by R, maybe typed in some special font.

Then he wants to define, say, addition of real numbers. So, given two real numbers, x and y, that might be equal, he wants to define x + y.

So, here he wants to regard addition, that is, +, as an operation. Then, as is usual for defining operations, he wants an operation to be just a special case of a function. So, he wants to call + a function. So, + will be a function of two variables, say, x and y. With usual function notation we will have

+(x,y) = x + y

The set of all (x,y) is the domain of the function, and the set of all x + y is the range.

So, that defines the function + except commonly in pure math we want to be explicit about the range and domain of the function.

For function +, the range is just the set of all pairs (x,y) with x and y in R. That set is also the set theory Cartesian product of set R with itself and written R x R. So, the domain of + is R x R. The range is just R. Then to be explicit about the range and domain of function +, we can write

+: R x R --> R

which says that + is a function with range R x R and domain R.

We learned how to add in, what, kindergarten? So, why make this so complicated?

Well, he wants to regard the real numbers as just one example of lots of different algebraic systems, e.g., groups, fields, vector spaces, and much more, with lots of operations and, possibly, more that could be defined. E.g., later in his book he will want to add vectors and matrices, take an inner product of two vectors, and multiply two matrices.

So, back to addition on the real numbers, he wants to regard that as just a special case of an operation on an algebraic system.

IMHO there's not much benefit for making adding two real numbers look so complicated.

Whatever he did in that chapter for defining addition on the reals, soon he is discussing matrix multiplication with no definition at all -- assuming the reader already understands that, that is defined and discussed many pages later in his book.

So, in his notation

+: R x R --> R

and matrix multiplication, he is using material before he has defined it, even before he has motivated, explained, exemplified, indicated the value of, and defined it. In good math writing and in good technical writing more generally, that practice is, in non-technical language, a bummer.

But from the table of contents, it appears that the book has quite a long list of possibly interesting narrow topics. And maybe for the routine material, his proofs and presentation are good -- maybe. I thought enough of the book to keep a copy of the PDF. It's there; if someday I want a discussion of some narrow topic, maybe I'll try his book!

In mathematical writing, it used to be common for the word processing to be much more work than the mathematics! Now with TeX and LaTeX, and I'm assuming that the book used one of these two, the flood gates are open!

[+] merlinsbrain|6 years ago|reply
I love that they have problems you can solve as well at the end of (almost) every chapter.

This IS a lot of math (1,962 pages) and it’s missing a preface/introduction which would have been helpful to understand if I need to go linear or if a la carte is okay. At the moment I’d assume each major section is independent.

Awesome find! Wonder how It’s used. (One of) the author(s) seems pretty prolific too - http://www.cis.upenn.edu/~jean/

[+] piggybox|6 years ago|reply
Every time I came across a new book on HN, I feel more needs to be done in my rest of life :)
[+] Koshkin|6 years ago|reply
> a la carte

Yeah, I wish we had an online resource (other than Wikipedia) anyone could learn any sort of math from in a systematic way... Oh well.

[+] meuk|6 years ago|reply
Whoah, this covers a lot. I was expecting some linear algebra, calculus, and discrete math, but there's actually some stuff in there I don't know after doing a masters in math.
[+] djaychela|6 years ago|reply
That makes me feel somewhat better - I saw the title of 'Maths Basics' and thought 'Great!'... then I saw it's 1,962 pages - if that's the basics, how much is the intermediate and advanced bit?
[+] raxxorrax|6 years ago|reply
Thought the same. 2000 pages - Basics was probably an understatement...

Just looked at a few pages and it seems really illustrative. I am just a light-weight mathematician as a computer scientist, but I really would have liked such a comprehensive script for studying. I hate it when profs reduce everything to minimal definitions and expect studends to make sense of it. There are countless books but it is always a gamble that they focus on the topic at hand and don't suffer from the same problems.

This even gives you "motivational examples" which are extremely helpful for comprehension in my opinion.

[+] abhisuri97|6 years ago|reply
I love professor gallier! He's an incredible person.

That being said, this is faaaaaar beyond basics. It'd be more appropriate to call this an incomplete (aiming to be comprehensive) guide to almost everything you need to know in computer science (related to math).

[+] markus_zhang|6 years ago|reply
That's almost 2,000 pages of math...I don't know why and how, but somehow I forgot most of the Statistics knowledge I obtained as a graduate student (in Stat) 10 years ago.

I remembered that I took an advanced course about Bayesian Inference, and one course about Multivariate Statistics (PCA, Factor analysis, these kind of things), and my project is about Bernstein Polynomial. That's it...

[+] xenihn|6 years ago|reply
You forget complex things that you don't use regularly. Math, spoken languages, written languages, coding...of course you can re-learn it, and re-learning is faster than learning it for the first time.

Based on speaking to my managers in the past, it seems like a year-long lapse is enough for you to lose an incredible amount of retained knowledge/skill. But it's not a permanent loss.

[+] emmanueloga_|6 years ago|reply
In basic calculus one can burn countless hours memorizing mechanical rules to derive and integrate different function forms, or one can just plug the function into something like wolfram-alpha and get, for a lot of useful cases, a symbolic answer, or at least some approximate answer for a point or interval.

The point is, understanding integrals and derivatives doesn't require one to memorize all the mechanical rules. Using software to compute those functions can be a huge time saver. No one should go with pen an paper double checking if that polynomial integral is correct or not!

With a book almost 2000 pages long, I wonder if this books leans more heavily on the mechanical-rules side of math. In my mind, is the difference between writing a book such that you can write your own wolfram alpha, or writing a book so you can just use it.

[+] MAXPOOL|6 years ago|reply
You don't need to memorize rules when studying math. Just like you don't need to spend any time to memorize syntax for programming languages. You automatically remember things you use a lot.

Once you have spent countless hours doing exercises to the extent that you understand the math, you already remember the rules. If you have not spent countless hours doing exercises, you don't understand anything at this level.

You don't hire a programmer who has read all the books and 'understands' programming but has never programmed. It's the same with math. You don't just read a math book from start to finish. You can use wolfram alpha for visualizing functions, not for learning math.

[+] commandlinefan|6 years ago|reply
> No one should go with pen an paper double checking if that polynomial integral is correct or not!

Hm - maybe I'm fortunate that I studied calculus before there were (accessible) software packages that could just do this stuff for you, because back then, the only way to solve these was to do them on paper. I'm sure I would have been tempted to just "skip ahead" to letting the computer do it for me, but I definitely learned a lot more going through all of the steps myself than I would have if I had just gotten a high-level understanding of what was going on and plugged the rest into a computer. Because, honestly, integrating polynomials is really, really easy - if you know how to do it, you can do it on paper faster than you can load up wolfram-alpha, type it in, and wait for an answer.

[+] xiaolingxiao|6 years ago|reply
This one does not. it's proof heavy, and there isn't really a mechanical way to do proofs that's efficient. The understanding comes first and often all in bunches when "the light turns on," then the proof follows.
[+] krosaen|6 years ago|reply
Haha 1900 pages on "basics"?

I suspect there are better resources for each topic covered (e.g Gilbert Strang books and OCW lectures for Linear Algebra), but it is definitely interesting to peruse and get a sense of relevant topics.

[+] jvehent|6 years ago|reply
"Math Basics"

It's 2000 pages long....

[+] j7ake|6 years ago|reply
Even if you were ambitious and manage to read 5 pages per day everyday this would take you more than one year to read this from start to finish.
[+] TimMurnaghan|6 years ago|reply
Nice to see wavelets in here - but it's a shame that he seems to be encouraging people to actually use Haar wavelets. They're fine for teaching - but there are usually better choices in real life. Daubechies are a good default
[+] floki999|6 years ago|reply
The writing style of this book i.e. rigorous math notation and proposition/proof presentation is going to put off the great majority of potential CS and ML readers. At almost 2000 pages it sure makes a great door-stop though.
[+] impaktdevices|6 years ago|reply
Me: Oh, good! I've always been pretty good at math but I want to learn how to make sense of the math I encounter in CS and ML.

[Reads the first paragraph of the 2nd chapter]

Me: I don't know anything about math. At all.

[+] amthewiz|6 years ago|reply
Basics should be concepts that get you to 80% and tell you where to look for the rest 20%. This book tries to get you directly to 95% and is best treated as a reference book.
[+] laichzeit0|6 years ago|reply
Strangely, probability theory is completely omitted.
[+] ps101|6 years ago|reply
I really can't figure out who the target audience for this book is, if it has a target audience at all.