theCricketer | 7 years ago:
Since we're on the topic of tutorials for understanding neural nets and modern deep learning, I'll throw in Michael Nielsen's excellently written free online "book" on neural nets. It's really a set of six long posts that takes you from zero to understanding all of the fundamentals, with almost no prerequisite math needed.
In clear, easy-to-understand language, Michael explains neural nets, the backprop algorithm, the challenges in training these models, some commonly used modern building blocks, and more:
http://neuralnetworksanddeeplearning.com/
This book opened my eyes to the power of textbooks written in such a clear, easy-to-understand style. I bet it took repeated revisions, feedback from others, and many hours of work, but such writing is a huge value add to the world.
jeraguilon | 7 years ago:
Great post that I often go back to. A curious fact about Karpathy is that he actually has a long history of teaching (relative to his age). About nine years ago, I learned how to speed-solve Rubik's cubes in ~12 seconds through his YouTube channel [0]. It's interesting to see that his simple teaching style transfers quite well to topics more technical than twisty puzzles.
[0] https://www.youtube.com/user/badmephisto
I think this eventually turned into Andrej Karpathy's class at Stanford, CS231n. The class notes are here:
http://cs231n.github.io/
The class is also on YouTube.
If you like this hacker's guide, I think you'll definitely like the class and the notes.
edit:
A lot of the compute-graph and backprop material in the hacker's guide is covered in the class, starting at about this point: https://www.youtube.com/watch?v=i94OvYb6noo&t=207s
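The compute-graph view of backprop that the guide and that lecture cover can be sketched in a few lines. This is my own illustrative toy (not code from the guide or the class): a two-gate circuit f = a*b + c, with gradients pushed backward through each gate and checked against a numerical estimate.

```python
# Toy compute-graph backprop: f = (a * b) + c, two gates.
# Illustrative sketch only, not code from the hacker's guide.

def forward(a, b, c):
    q = a * b      # multiply gate
    f = q + c      # add gate
    return q, f

def backward(a, b, c, q):
    # Start with df/df = 1 and apply the chain rule gate by gate.
    df_dq = 1.0          # the add gate routes the gradient unchanged
    df_dc = 1.0
    df_da = df_dq * b    # the multiply gate scales by the other input
    df_db = df_dq * a
    return df_da, df_db, df_dc

a, b, c = -2.0, 3.0, 10.0
q, f = forward(a, b, c)
grads = backward(a, b, c, q)

# Sanity check: compare the analytic df/da to a finite difference.
h = 1e-6
num_da = (forward(a + h, b, c)[1] - f) / h

print(f, grads, round(num_da, 3))  # f = 4.0, df/da = b = 3.0
```

Chaining many such local gate derivatives is all backprop does; frameworks just automate the bookkeeping over a large graph.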
There's a lot of high-school math there, but the trouble is that the real workings of neural networks (the speed of convergence, and why/whether they work on samples outside the training/validation set) are left a mystery, if you ask me.
It is relatively clear why it works beyond the training and validation set: what is being approximated is a smooth function, which in the case of a classification task maps the space of things to be classified (images of a certain size, say) to the n-simplex, where n is the number of classes. The preimage theorem then tells you that over a regular value of this smooth map lies a codimension-n submanifold of the space of things to be classified. That submanifold can in turn be interpreted as the set of all things that look like the class being assigned, especially close to the corners of the n-simplex (being a regular value is an open condition).
In short: because the map is constructed to be smooth, it will make sense beyond whatever the training/validation data was. Note that this does not guarantee that the network has learned something reasonable about the dataset, just that it has found some way to smoothly separate it into different components.
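To make the "smooth map into the n-simplex" picture concrete, here is a small numpy illustration of my own (random untrained weights, purely hypothetical): a linear map followed by softmax always lands on the simplex, and because it is smooth, nearby inputs land at nearby points.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny random "classifier": linear map then softmax.
# Weights are random and untrained; this only illustrates
# smoothness, not that anything useful has been learned.
W = rng.normal(size=(3, 5))   # 3 classes, 5 input features

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

def classify(x):
    return softmax(W @ x)

x = rng.normal(size=5)
p = classify(x)
p_near = classify(x + 1e-4 * rng.normal(size=5))

# The output always lies on the 2-simplex (entries sum to 1)...
print(np.isclose(p.sum(), 1.0))                  # True
# ...and the map is smooth: a tiny input change moves the
# output only a tiny amount.
print(float(np.linalg.norm(p - p_near)) < 1e-2)  # True
```

As the comment notes, this only shows the classifier extends smoothly off the training data; it says nothing about whether the decision regions are reasonable.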
General popular opinion seems to be that these are, to a greater or lesser extent, a mystery for everyone. Can you suggest any intermediate reading on things like generalisation? I've looked online but only found either "here's how to recognize numbers in the MNIST dataset using numpy" or "first we take <long string of squiggles>, which trivially implies <longer string of squiggles>..."
Rainymood | 7 years ago:
Badmephisto == Andrej Karpathy?!
I would never have made the connection... badmephisto also got me into speedcubing; my PB is ~14 seconds. Crazy.
freediver | 7 years ago:
Not as educational, but funny to see him getting owned in WoW.
https://www.youtube.com/watch?v=2b-F8QqHFaM
yorwba | 7 years ago:
Previous submissions:
https://news.ycombinator.com/item?id=14769525
https://news.ycombinator.com/item?id=9249924
https://news.ycombinator.com/item?id=8553307
lettergram | 7 years ago:
https://austingwalters.com/neural-networks-to-production-fro...
Uses up-to-date Keras and Python, and doesn't go as deep into the network connections themselves.
I run regular trainings and teach seminars on neural networks, and I find most tutorials online go too in-depth on constructing a network from scratch (such as this one); they lose people.
The biggest issues today are actually data formatting and ingestion, then hyperparameter tuning. You really only need to grasp the basics to get started in 2019.
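To illustrate the point about data formatting being where the real work is, here is a small sketch of my own (not from the linked post): the unglamorous preprocessing — standardizing feature columns and one-hot encoding labels — that typically has to happen before any model code runs at all.

```python
import numpy as np

# Hypothetical raw data: three samples with two features each,
# plus integer class labels. Purely illustrative values.
raw = np.array([[170.0, 65.0],
                [180.0, 80.0],
                [160.0, 55.0]])   # e.g. height (cm), weight (kg)
labels = np.array([0, 2, 1])      # class ids, 3 classes total

# Standardize each feature column to zero mean, unit variance,
# so no feature dominates purely because of its units.
mean, std = raw.mean(axis=0), raw.std(axis=0)
X = (raw - mean) / std

# One-hot encode the labels for a softmax/cross-entropy setup.
Y = np.eye(3)[labels]

print(X.mean(axis=0))  # ~[0, 0]
print(Y)
```

None of this is deep learning per se, but getting it (and the ingestion pipeline around it) right usually takes far longer than wiring up the network itself.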