Ask HN: What are the foundational texts for learning about AI/ML/NN?

"Introduction to Statistical Learning" - https://www.statlearning.com/

(there's also "Elements of Statistical Learning" which is a more advanced version)

AI: A Modern Approach - https://aima.cs.berkeley.edu/

rg111|3 years ago

ISL is a legit good book. Has the correct amount and balance or rigor and application.

The explanation, examples, projects, math- all are crisp.

As the name suggests, it is only an introduction (unlike CLRS). And it does serve as a great beginners' book giving you proper foundation for the things that you learn and apply in the future.

One thing people complain about is it being written in R, but no serious hacker should fear R, as it can be picked up in 30 minutes, and you can implement the ideas in Python.

As someone with industry experience in Deep Learning, I will recommend this book.

The ML course by Andrew Ng has no parallel, though. One must try and do that course. Not sure about the current iteration, but the classic one (w/ Octabe/MATLAB) was really great.

bjornsing|3 years ago

The Elements of Statistical Learning, by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie. I’ve seen it referenced quite a few times and the TOC looks good.

kevinskii|3 years ago

I agree. I read the first edition to Intro to Statistical Learning and it went into just the right level of mathematical depth. The authors also have Youtube lectures that accompany the chapters, and these are a great reinforcement of the material.

ranc1d|3 years ago

Nice I didn't realise they released a 2nd edition this book and also new website too! Thanks for sharing

KRAKRISMOTT|3 years ago

Haugeland is GOFAI/cognitive science, not directly relevant to modern machine learning variety of models unless you are doing reinforcement learning or trees stuff (hey poker/chess/Go bots are pretty cool!). Russel and Norvig are the typical introductory textbooks for those. Marks and Haykins are all severely out of date (they have solid content, but they don't have the same scale of modern deep learning which has many emergent properties).

You are approaching this like an established natural sciences field where old classics = good. This is not true for ML. ML is developing and evolving quickly.

I suggest taking a look at Kevin Murphy's series for the foundational knowledge. Sutton and Barto for reinforcement learning. Mackay's learning algorithms and information theory book is also excellent.

Kochenderfer's ML series is also excellent if you like control theory and cybernetics

https://algorithmsbook.com/ https://mitpress.mit.edu/9780262039420/algorithms-for-optimi... https://mitpress.mit.edu/9780262029254/decision-making-under...

For applied deep learning texts beyond the basics, I recommend picking up some books/review papers on LLMs, Transformers, GANs. For classic NLP, Jurafsky is the go-to.

Seminal deep learning papers: https://github.com/anubhavshrimal/Machine-Learning-Research-...

Data engineering/science: https://github.com/eugeneyan/applied-ml

For speculation: https://en.m.wikipedia.org/wiki/Possible_Minds

mtlmtlmtlmtl|3 years ago

A quick point about the "tree stuff" and Norvig&Russell:

While it does cover minimax trees, alphabeta etc, it only really provides a very brief overview. The book is more of an overview of the AI/ML fields as a whole. Game playing AI is dense with various game-specific heuristics that the book scarcely mentions.

Not sure about books, but the best resource I've found on at least chess AI is chessprogramming.org, then just ingesting the papers from the field.

ipnon|3 years ago

To your second point I have a sneaking suspicion whatever is recommended in this very thread will suddenly jump in its estimation as a “classic.” History is made up as it goes along!

starwind|3 years ago

Does the order matter for Kochenderfer? Any one of those put more emphasis on controls than the others?

mfrieswyk|3 years ago

Appreciate the comment very much. I feel like I need to build a foundation context in order to appreciate the significance of the latest developments, but I agree that most of what I posted doesn't represent the state of the art.

TaupeRanger|3 years ago

There are none anymore. We now know that throwing a bunch of bits into the linear algebra meat grinder gets you endless high quality art and decent linguistic functionality. The architecture of these systems takes maybe a week to deeply understand, or maybe a month for a beginner. That's really it. Everything else is obsolete or no longer applicable unless you're interested in theoretical research on alternatives to the current paradigm.

rg111|3 years ago

You are plain exaggerating. You can't do all of them in a few weeks. Algorithms: Lin Reg -> Log Reg -> NN -> CNN + RNN -> GANs + Transformers -> ViT -> Multimodal AI + LLMs + Diffusion + Auto Encoders

    SVM, PCA, kNN, k-means clustering, etc.

    LightGBM, XGboost, Catboost, etc.

    Optimization and optimizers.

    Application-wise:
    Classification, Semantic Segmentation, Pose Estimation, Text Generation, Summarization, NER, Image Generation, Captioning, Sequence Generation (like music/speech), text to speech, speech to text, recommender systems, sentiment amalysis, tabular data, etc.

    Frameworks:
    pandas, sklearn, PyTorch, Jax -> training  inference, data loading

    Platforms:
    AWS + GCP + Azure
    And a lot of GPU shenanigans + framework/platform specific quirks

All these will take you ~2 years or 1.5 years at least,

given that:

- you already know Python/any programming language properly

- you already know college level math (many people say you don't need it, but haven't met a single soul in ML research/modelling without college level math)

- you know Stats 101 matching a good uni curriculum and ability to learn beyond

- you know git, docker, cli, etc.

Every influencer and their mother promising to teach you Data Science in 30 days are plain lying.

Edit: I see that I left out Deep RL. Let's keep it that way for now.

Edit2: Added tree based methods. These are very important. XGBoost outperforms NNs every time on tabular data. I also once used an RF head appended to a DNN, for final prediction. Added optimizers.

sillysaurusx|3 years ago

A month to deeply understand?

I've been doing it since early 2019 and there are still subtleties that catch me off guard. Get back to me when you're not surprised that you can get rid of biases from many layers without harming training.

I broadly agree with you, but the timeline was just a little too aggressive. By about 10x. :)

jtmcmc|3 years ago

This is definitely a take that ignores the massive amount of utility for ML that exists outside of generative images and NLP on the one hand and on the other vastly misrepresents the time it takes to understand a model, assuming one does not already have a background in CS, linear algebra and in particular matrix calculus, probability, stats, etc...

cyber_kinetist|3 years ago

You still need to understand some basic theory/math about probabilistic inference (along with some knowledge of linear algebra), or else you’ll get a bit overwhelmed by some of the equations and not understand what the papers are talking about. PRML by Bishop is probably more than enough to start reading ML papers comfortably though. (This would probably be too easy for a competent math major, but not all of us are trained that way from the beginning…)

moneywoes|3 years ago

What resources are there to understand in a month?

raz32dust|3 years ago

I personally consider Linear algebra to be foundational in AI/ML. Intro to Linear algebra, Gilbert Strang. And his free course on MIT OCW is fantastic too.

While having strong mathematical foundation is useful, I think developing intuition is even more important. For this, I recommend Andrew Ng's coursera courses first before you dive too deep.

mindcrime|3 years ago

Another interesting resource for Linear Algebra is the "Coding the Matrix" course.

http://codingthematrix.com/

https://www.youtube.com/playlist?list=PLEhMEyM9jSinRHXJgRCOL...

viscanti|3 years ago

Strang is great but he covers a lot of things that don't have much carryover to AI/ML and doesn't really cover things like Jacobians which do. Maybe there's something more useful for someone who is only learning Calculus and Linear Algebra for AI/ML than what Strang teaches.

nephanth|3 years ago

Linear algebra, and differential calculus (needs linear algebra), and a bit of optimisation (at least get an understanding of sgd)

Also proba/statistics! Without those you can end up doing stuff pretty wrong

mfrieswyk|3 years ago

I never took beyond Precalculus in school, thanks for the tip!

crosen99|3 years ago

"Neural Networks and Deep Learning", by Michael Nielsen http://neuralnetworksanddeeplearning.com (full text)

The first chapter walks through a neural network that recognizes handwritten digits implemented in a little over 70 lines of Python and leaves you with a very satisfying basic understanding of how neural networks operate and how they are trained.

martythemaniak|3 years ago

This is the thing that made NNs "click" for me, I think it was very good. Before this I did Andrew Ng's old ML course on coursera, so I thought that was a good intro to old ML approaches, common terms/techniques and flowed nicely into NNs.

But there's are both kinda old now, so there must be something newer that'll give you an equally good intro to transformers, etc.

nmfisher|3 years ago

+1 for this, when I was coming in as a complete newb to neural networks, this was the clearest and most accessible material I found.

conjectureproof|3 years ago

+1 on Elements of Statistical Learning.

Here is how I used that book, starting with a solid foundation in linear algebra and calculus.

Learn statistics before moving on to more complex models (neural networks).

Start by learning ols and logistic regression, cold. Cold means you can implement these models from scratch using only numpy ("I do not understand what I cannot build"). Then try to understand regularization (lasso, ridge, elasticnet), where you will learn about the bias/variance tradeoff, cross-validation and feature selection. These topics are explained well in ESL.

For ols and logistic regression I found it helpful to strike a 50-50 balance between theory (derivations and problems) and practice (coding). For later topics (regularization etc) I found it helpful to tilt towards practice (20/80).

If some part of ESL is unclear, consult the statsmodels source code and docs (top preference) or scikit (second preference, I believe it has rather more boilerplate... "mixin" classes etc). Approach the code with curiosity. Ask questions like "why do they use np.linalg.pinv instead of np.linalg.inv?"

Spend a day or five really understanding covariance matrices and the singular value decomposition (and therefore PCA which will give you a good foundation for other more complicated dimension reduction techniques).

With that foundation, the best way to learn about neural architectures is to code them from scratch. Start with simpler models and work from there. People much smarter than me have illustrated how that can go: https://gist.github.com/karpathy/d4dee566867f8291f086 https://nlp.seas.harvard.edu/2018/04/03/attention.html

While not an AI expert, I feel this path has left me reasonably prepared to understand new developments in AI and to separate hype from reality (which was my principal objective). In certain cases I am even able to identify new developments that are useful in practical applications I actually encounter (mostly using better text embeddings).

Good luck. This is a really fun field to explore!

poulsbohemian|3 years ago

I'm sitting ten feet from my copy of Artificial Intelligence, a modern approach – Stuart Russell, Peter Norvig. While I will say it has a lot of still worthwhile basic information, I really wouldn't recommend it. It's an enormous book, so physically difficult to read, but also the bulk of the content is somewhere between dated and terse. I went through school and studied AI ten years before it was written, and I'm glad I didn't use it as an undergrad textbook - would have been overwhelming.

One of the problem with AI is exactly what you noted above - there are a lot of subcategories and my gut tells me these will grow. For the real neophyte, I'd say start with something that interests you or that you need for work - you likely aren't going to digest all of this in a month and probably no single book will meet all your needs.

bradreaves2|3 years ago

This is off the beaten path, but consider Abu-Mostafa et al.'s "Learning from Data". https://www.amazon.com/Learning-Data-Yaser-S-Abu-Mostafa/dp/...

I adore PRML, but the scope and depth is overwhelming. LfD encapsulates a number of really core principles in a simple text. The companion course is outstanding and available on EdX.

The tradeoff is that LfD doesn't cover a lot of breath in terms of looking at specific algorithms, but your other texts will do a better job there.

My second recommendation is to read the documentation for Scikit.Learn. It's amazingly instructive and a practical guide to doing ML in practice.

vowelless|3 years ago

I strongly second this. Abu Mostafa has videos and homework for this course too. This course was the one that made a LOT of fundamental things “click”, like, why does learning even work and what are some broad expectations about what we can and cannot learn.

PartiallyTyped|3 years ago

LfD is a great book to get people to think about complexity classes and model families. We used that in my grad course and I can recommend it.

bjornsing|3 years ago

It’s probably a bit off the beaten path, but I can highly recommend Probability Theory, The Logic of Science, by E. T. Jaynes.

In the opening chapter Jaynes describes a hypothetical system he calls “The Robot”. He then lays out the mathematics of the “The Robot’s” thinking in detail: essentially Bayesian probability theory. This is the best summary of an ideal ML/AI system I’ve come across. It’s also very philosophically enlightening.

sillysaurusx|3 years ago

I'm so sad the editor chose not to publish Jaynes' C snippets because "they were too cryptic." They would've helped clarify the ideas greatly.

It's a good book, but I don't know how it's related to ML. My own answer would be "Just do it." Find an ML project you like and start tinkering around. But everyone learns differently, so maybe there's a book that can replace experience.

misiti3780|3 years ago

seconded! it's a great book.

gerash|3 years ago

I'd suggest these two by Kevin Murphy:

Probabilistic Machine Learning: An Introduction

https://probml.github.io/pml-book/book1.html

Probabilistic Machine Learning: Advanced Topics

https://probml.github.io/pml-book/book2.html

pablo24602|3 years ago

Working through these right now- definitely recommend them

junkerm|3 years ago

In read parts of Murphys "Probabilistic Maschine Laearning" (vol 1) which is an update of an existing book in ML. It covers a broad range of topics also very recent developments. It also includes foundation topics such as probability, linear algebra, optimization. Also it is quite aligned with the Goodfellow book. I found it quite challenging at certain points. What helped a lot was to read a book on bayesian statistics. I used Think Bayes by Allen Downey for that (http://allendowney.github.io/ThinkBayes2/index.html)

daturkel|3 years ago

I maintain a list of well-known or foundational papers in ML in a github repo that may be of interest to readers of this thread

https://github.com/daturkel/learning-papers

digitalsushi|3 years ago

Are there obvious paths into these spaces for someone stuck over in devops/infrastructure/platform engineering? Or is it too far a hop to really find a direct path in?

Let me ask a slightly different way - can someone like me get into a job like these, without needing some more college?

My day job is wrapping up OS templates for people with ML software and I always wonder what they get to go do with them once they turn into a compute instance.

throwaway81523|3 years ago

> Let me ask a slightly different way - can someone like me get into a job like these, without needing some more college?

It is a trendy area and in such areas there is always skepticism towards wannabe entrants. As for whether you know enough math, I would start by watching the fast.ai videos and seeing if you're comfortable with the explanations and tools.

I can say I have a stronger math background than most programmers (though less strong than that of real math geeks) and I don't think I know enough math to really grok this stuff, but I'm always after a more foundational understanding than it takes to just use the tools. I think there are opportunities that don't require the math, but are just about having gotten some practice with packages X, Y, or Z. In the end though, those are like web frameworks that become obsolete all the time. So it is worth spending time on foundations.

zmgsabst|3 years ago

Why not ask them?

Call it cross functional training to increase your domain knowledge, tell your manager you need it to ensure you’re providing the best service possible, and get your coworkers to help you learn the framework they use…?

jtmcmc|3 years ago

if you're already doing a job at a company that does this stuff, can you talk to people about wanting to change teams and learn?

friendlyHornet|3 years ago

I would like to know this, as well.

ly3xqhl8g9|3 years ago

Not sure if foundational (quite a tall order in such a fast-moving field), but for sure a nice introduction into neural networks, and even mathematics in general (for a teenager, because it's nice to see numbers in action beyond school-level algebra):

→ Harrison Kinsley, Daniel Kukiela, Neural Networks from Scratch, https://nnfs.io, https://www.youtube.com/watch?v=Wo5dMEP_BbI&list=PLQVvvaa0Qu...

Somewhat foundational, if not in actuality, then in the intention to actually build a theory as in theory of gravitation, although not necessarily an introductory text:

→ Daniel A. Roberts, Sho Yaida, The Principles of Deep Learning Theory, https://arxiv.org/abs/2106.10165

avipeltz|3 years ago

- AIMA by Russel and Norvig is a classic but I would say is more of overview of the field and for most topic areas isn't quite deep enough imo.

- For deep learning specifically, a more applied text that is beautifully written and chock full of examples is Francois Chollet's Deep Learning with Python (there a new second edition out with up to date examples using modern versions of Tensorflow). The first 3 chapters I would give as required reading for anyone interested in understanding some deep learning fundamentals.

- Deep Learning - goodfellow and bengio - seems like it would be hard to get through without a reading group not exactly a APUE or K&R type reading experience but I haven't spent enough time with it.

If you haven't taken a Linear Algebra or Differential Equations class its useful stuff to know for ML/DL theory but not fully necessary to do applied work with modern high level libraries, but definitely having a strong understanding of basic matrix math is useful.

If you have interests in natural language processing theres a couple good books:

- Natural Language Processing with Python - Bird Klein, Loper, is a great intro to NLP concepts and working with NLTK which may be a bit dated to some but I would definitely recommend, and its online for free. Great examples.(https://www.nltk.org/book/)

- Speech and Language Processing - Dan Jurafsky and James H. Martin - is good, though I have only spent much time with the pre-print

And then theres a lot of papers that are good reads. Let me know if you have any questions or want a list of good papers.

If you just want to get off the ground and start playing with stuff and building things I'd recommend fast.ai's free online course - its pretty high level and a lot is abstracted away but its a great start and can enable you to build lots of cool things pretty rapidly. Andrew Ng's online course also is quite requitable and will probably give you a bit more background and fundamentals.

If I were to choose one book from the bunch it would be Chollet it gives you pretty much all the building blocks you need to be able to read some papers and try to implement things yourself and I find building things a much more satisfying way to learn than sitting down and writing proofs or just taking notes but thats just my preference.

rg111|3 years ago

Norvig-Russel has many chapters spanning hundreds of pages that are way out of date and not used anywhere.

And the new things he cover are covered in a better manner and better depth in other sources.

I read this book like a novel. Good for a basic overview, but the RoI is very low.

dezzeus|3 years ago

You may want to also consider this one:

Artificial Intelligence, a modern approach – Stuart Russell, Peter Norvig

mindcrime|3 years ago

Can't recommend this highly enough, if for no other reason than to provide some context to help the OP from getting trapped in the "deep learning is all you need" echo-chamber. Sure ANN's and DL are great and do amazing things, but until it's proven that they really are the "be all, end all" (something I suspect we're far from) then it makes sense to dedicate at least some cycles to considering other paradigms.

apu|3 years ago

The big book of stuff that doesn't work.

gaspb|3 years ago

If you're more inclined to theory, I would suggest "Learning Theory from First Principles" by F. Bach: https://www.di.ens.fr/~fbach/ltfp_book.pdf

The book assumes limited knowledge (similar to what is required for Pattern Recognition I would say) and gives a good intuition on foundational principles of machine learning (bias/variance tradeoff) before delving to more recent research problems. Part I is great if you simply want to know what are the core tenets of learning theory!

sinenomine|3 years ago

Ironic, since the relatively recently discovered double descent makes it clear that bias-variance tradeoff as we know it from statistical learning theory simply doesn't apply to "overparameterized" deep models.

Much of old theory is barely applicable and people are, understandably, bewildered and in denial.

If someone were to be inclined to theory, I'd just recommend reading papers that don't try oversimplify the domain:

https://arxiv.org/abs/2006.15191

https://arxiv.org/abs/2210.10749

https://arxiv.org/abs/2205.10343

https://arxiv.org/abs/2105.04026

stevenbedrick|3 years ago

To add to the great recommendations on this thread, I really like Moritz Hardt and Benjamin Recht's "Patterns, Predictions, and Actions". It's published by Princeton University Press here: https://press.princeton.edu/books/hardcover/9780691233734/pa...

But is also available online as a preprint here: https://mlstory.org/

master_yoda_1|3 years ago

This book is all you need https://probml.github.io/pml-book/book1.html

rg111|3 years ago

Do you have Linear Algebra knowledge, and Stats 101 knowledge?

Then start with ISLR.

Then go and watch Andrew Ng Machine Learning course on Coursera (a new version was added in 2022 that uses Python).

Then read the sklearn book from its maintainers/core devs. It's from O'Reilly.

Then go do the Deep Learning Specialization from deeplearning.ai.

Then do fast.ai course.

If interested in Deep RL, watch David Silver lectures, then read Deep RL in Action by Zai, Brown. Then do the HF course on Deep RL.

This is how you get started. Choose your books based on your personality, needs, and contents covered.

And among MOOCs, I highly suggest the one by Canziani, LeCun from NYU. (I loved the 2020 version.)

The one taught by Fei Fei Li and Andrej Karpathy is nice.

These two MOOCs can substitute classic books based on quality.

I have never read cover to cover any of the famous books. I read a lot from them sticking to specific subjects.

Get to reading papers, finding implementations. Ng + ISLR will give you good grounds. Fast.ai + deeplearning.ai will give you capability to solve real problems. NYU + Tubingen + Stanford + UMich (Justin Johnson) courses will bring you to the edge.

You need a lot of practical experience that aren’t taught anywhere. So, get your hands dirty early. Learn to use frameworks, cloud platforms, etc.

Then start reading papers.

A crystal clear grasp on Math foundations is a must. Get it if you don't have already.

pkoird|3 years ago

AIMA by Russel and Norvig is a must read IMO.

ipnon|3 years ago

I’d posit we don’t understand AIML enough to know their foundations with much certainty. Take for example the discovery of emergent zero-shot properties in the latest LLMs. My recommendation to a beginner would be to grok gradient descent, matrix multiplication, and the universal approximation theorem, then get on to engineering like the rest of us. You can’t go wrong with Jeremy Howard’s FastAI course and his “Deep Learning for Coders.”

IanCal|3 years ago

I think a good start is to think about what you want to do. "Back in my day" ai was mostly academic and had more classic foundational parts with newer flashy bits. It wasn't, broadly, applicable to the real world. Some parts but not a huge amount.

Now I think you've got key parts. There's how to use recent production ready models/systems, how to train them and how to make them. Is it in a research or business context?

The field is also broad enough that any one section (text, images, probably symbols) and subsection (time series, bulk, fast online work) all have significant bodies of work behind them. My splits here will not be the best currently so I'm happy for any corrections on a useful hierarchy by the way.

Perhaps you're interested in the history and what's led up to today's work? That's more of a "brief history of time" style coverage, but illuminating.

I'm aware I've not helpfully answered, but I think the same question could have very different valid goals and wanted to bring that to the fore.

robg|3 years ago

Coming from cognitive neuroscience surprised that Explorations in Parallel Distributed Processing by McClelland and Rumelhart doesn’t get more attention as a classic in bridging old school AI approaches with the modern paradigm.

https://psycnet.apa.org/record/1988-97441-000

rramadass|3 years ago

This is nice; I am more interested in first understanding the origins/concepts/ideas behind AI/ML than in all the complicated mechanisms involved in implementing them (i.e. the simplest possible explanation/implementation) and hence these sort of books really interest me.

Any more recommendations?

PS: You might find Vehicles: Experiments in Synthetic Psychology by Valentino Braitenberg interesting if you don't already know of it.

throwaway81523|3 years ago

Blum, Hopcroft, and Kannan, Foundations of Data Science looks good:

https://www.cs.cornell.edu/jeh/book%20no%20so;utions%20March...

Also in published form from Cambridge University Press:

https://www.cambridge.org/core/books/foundations-of-data-sci...

jgrimm|3 years ago

Learning From Data (https://amlbook.com) is a great introduction to ML from a more theoretical perspective. The language is easy to understand but the concepts that it deals with are very theoretical, a combination that is hard to find elsewhere.

For example nearly everyone understands how to apply multivariable logistic regression, in say Numpy, however a good grasp of underlying concepts such as confidence bounds for overfitting and and being able to use formal proofs to explain concepts such as VC Generalisation will both help you stand out and provide a good foundation that makes further learning much easier.

cscurmudgeon|3 years ago

Get a strong grasp on Linear Algebra and everything else falls into place more easily

https://math.mit.edu/~gs/learningfromdata/

5cott0|3 years ago

https://www.manning.com/books/deep-learning-with-python-seco...

adg001|3 years ago

I have not seen mentioned so far in this thread the following book, which I can't recommend more highly:

Understanding Machine Learning: From Theory To Algorithms – Shai Shalev-Shwartz

dmarcos|3 years ago

I remember Carmack mentioning in a podcast a list of seminal papers that Ilya Sutskever (@ilyasut) gave to him to learn AI foundations. I would love to see that list.

zffr|3 years ago

You may also want to consider reading through some of the important (or highly cited) academic papers in AI/ML/NN. From these papers you may get a sense of the techniques researchers are using, and which topics are most important to learn.

I have not applied this technique to AI/ML/NN specifically, but it has been useful for me when trying to learn other topics.

epgui|3 years ago

The foundations of AI/ML are really linear algebra and statistics. But not the kinds of stats most people learn in undergrad: focus on linear models (there are tons of great books on just that; also look up “common statistical tests are linear models” for a great intro into what i’d call useful stats), bayesian stats, anova/manova/permanova, etc.

dceddia|3 years ago

I’m a big fan of learning through practice vs learning all the theory up front, and for anyone else who feels the same, the Fast AI course and book are very good: https://fast.ai

The authors are working on a new course that’ll dive deep into the modern Stable Diffusion stuff too, which I’m looking forward to.

cttet|3 years ago

I will recommend Information Theory, Inference, and Learning Algorithms by David MacKay. If you really want to understand the "learning" part, rather than being given a methodology without knowing why or in the unsorted bounds that "guarantee" abstract things that may mismatch reality.

alphabetting|3 years ago

For a less technical history of the field and major players I'd recommend Genius Makers.

6gvONxR4sf7o|3 years ago

Kevin Murphy’s books (especially the new ones) are what I’d point anyone towards for ML.

davidhunter|3 years ago

The Quest for Artificial Intelligence: A History of Ideas and Achievements Nils J. Nilsson

This is a good overview of the history of the field (up to SVMs and before deep NNs). I found this useful for putting all the different approaches into context.

bilsbie|3 years ago

If anyone is just starting and out wanting to do a study group let me know.

I’m having trouble keeping my motivation up but I really want to get up to speed on how LLM’s work and someday make a career switch.

moneywoes|3 years ago

Im down

PartiallyTyped|3 years ago

I recommend against DL by Goodfellow. At this point it is pretty much outdated. Actually, anything specific to NNs is already outdated by release.

You'd need the following background:

- Linear Algebra

- Multivariate Calculus

- Probability theory && Statistics

Then you need a decent ML book to get the foundations of ML, you can't go wrong with either of these:

- Bishop's Pattern Recognition

- Murphy's Probabilistic ML

- Elements of statistical learning

- Learning from data

You can supplement Murphy's with the advanced book. Elements is a pretty tough book, consider going through "Introduction to statistical learning"[1]. Bishop and Murphy include foundational topics in mathematics.

LfD is a great introductory book and covers one of the most important aspects of ML, that is, model complexity and families of models. It can be supplemented with any of the other books.

I'd also recommend doing some abstract algebra, but it's not a prerequisite.

If you would like a top-down approach, I recommend getting the book "Mathematics of Machine Learning" and learning as needed.

For NN methods, some recommendations:

- https://paperswithcode.com/methods/category/regularization

- https://paperswithcode.com/methods/category/stochastic-optim...

- https://paperswithcode.com/methods/category/attention-mechan...

- https://paperswithcode.com/paper/auto-encoding-variational-b...

For something a little bit different but worth reading given that you have the prerequisite mathematical maturity

- Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges | https://arxiv.org/abs/2104.13478

[1] https://www.statlearning.com/

Many thanks to the user "mindcrime" for catching my error with Introduction to statistical learning.

mindcrime|3 years ago

consider going through "Introductions to Elements of statistical learning"

Was that supposed to be An Introduction to Statistical Learning[1] or maybe Introduction to Statistical Relational Learning[2]? I don't think there is a book titled Introduction to Elements of Statistical Learning?

[1]: https://www.statlearning.com/

[2]: https://www.cs.umd.edu/srl-book/

sillysaurusx|3 years ago

(I can't wait until the myth that you need linear algebra and calculus to do ML finally dies. It's like saying that you need to understand assembly to do programming. It helps, but it's far from a requirement.)

antegamisou|3 years ago

As always on HN, the right answer is at the bottom.

nephanth|3 years ago

At some points, UC Berkeley's course videos were available on the web, and they had a pretty good AI course

jpamata|3 years ago

ISLR for foundation and passing interviews. It also has lectures in youtube, just type ISLR lectures.

revskill|3 years ago

Without a running code, it's hard to grasp concepts. So i prefer texts with code.

113 comments