
Practical Deep Learning for Coders

683 points | jph00 | 9 years ago | course.fast.ai | reply

91 comments

[+] OmarIsmail|9 years ago|reply
I've been coding for 20 years and professionally for almost 15 and I started watching the first video and found it to be pretty difficult to follow. I think you make a pretty common mistake that a lot of technical people do, which is take for granted how much "institutional" knowledge you have.

The topics touched on in the first 30 minutes of the video include: AWS, Jupyter Notebooks, Neural Networks, Tmux, and a few others. I understand that this is the reality of the situation today (very large up-front cost of setting everything up) but it would be better to not even touch upon something like Tmux because it's not absolutely essential and just results in information overload. You can replace it with something like "I like using Tmux to save my terminal sessions, check out the wiki to learn more" instead of "here's this 3-minute tutorial on Tmux in the middle of how to use Jupyter notebooks". Very few people are smart enough to be concentrating on following what's going on with AWS/Jupyter notebooks and then pause that, process the Tmux stuff, and then context switch back to AWS/Jupyter.

There's a reason why the wiki/forums are so invaluable. There's definitely some really good information in the videos, so if you guys had the time I really hope you edit the videos into "abridged" versions that focus on only one topic instead of jumping around so much.

[+] jph00|9 years ago|reply
That's a great point - but I would like to add that the expectation is that people put in at least 10 hours a week. So the amount of material is designed to be what you can learn in 10 hours of study, with the support of the wiki and forums.

The lessons are not designed to stand alone, or to be watched just once straight through.

The details around setting up an effective environment are a key part of the course, not a diversion. We really want to show all the pieces of the applied deep learning puzzle.

In a more standard MOOC setting this would have been many more lessons, each one much shorter - but we felt the upside of having the live experience with people asking questions was worth the compromise.

I hope you stick with it, and if you do I'd be interested to hear if you still feel the same way at the end. We're doing another course next year so feedback is important. On the whole most of our in-person students ended up liking the approach, although most found it more intensive than they expected. (We did regular anonymous surveys.)

[+] dposter|9 years ago|reply
I got so excited when I read your comment. No offense, but there are so many basic intro videos like what you're asking for. However, after that... there's nothing. I've been looking for something to take me to the next level, and when I read "in the first 30 minutes of the video include: AWS, Jupyter Notebooks, Neural Networks, Tmux" I squealed with joy. If anyone knows other advanced tutorials on how to design, manage, and scale up their operation into some seriously organized, efficient, and automated machine learning, I'm dying to find out.
[+] master_yoda_1|9 years ago|reply
You should be thankful that he did not ask you to learn linear algebra or real analysis :)
[+] pknerd|9 years ago|reply
sentdex (Harrison Kinsley) is doing great stuff making video tutorials on ML/deep learning. Do check them out.
[+] Simorgh|9 years ago|reply
This is really stunning. I can't wait to commence the course. I finished a Masters from a top-50 worldwide university, and frankly, the approach to data science was mediocre at best. The NLP module notes were plagiarised from Stanford and we were quite happy with this! It gave us a break from 20-year-old textbooks that set the plodding pace for the Data Mining module. And don't get me started on my deep learning dissertation. The only expert in the uni on the topic got poached by Facebook halfway through the project. The universities are finding it difficult to keep up and are resorting to 'interesting' techniques to retain talent - witness the Turing Institute in the UK. They gave out titles to many professors in several universities a year or so ago... as I gather, as a precursor to pivotal data science research.
[+] laughingman1234|9 years ago|reply
I applied for a Masters in ML at one of these universities:

CMU, UT Austin, Georgia Tech, UCSD.

I am not in the US; I thought getting an MS from one of these would boost my chances of getting into something like Google Brain or OpenAI.

Is it a waste of time and money in your opinion?

[+] jph00|9 years ago|reply
Hi all, Jeremy Howard here. I'm the instructor for this MOOC - feel free to ask any questions here.
[+] kirkules|9 years ago|reply
Just FYI: The AWS console UI is currently different in a number of ways from the interface shown in the "how to set up your aws deep learning server" video, beyond what is accounted for in the overlay-text amendments to the video. (e.g. creating a user has 4 involved steps before you get to see the security credentials, including setting up a user group and access policies for that group -- I made my user have the AdministratorAccess policy...)
[+] dharma1|9 years ago|reply
Looks good!

Recommendations on how to go from basics (being able to fine-tune pretrained ImageNet/Inceptionv3 with new data etc) to a real project? I'd like to play with semantic segmentation of satellite images (hyperspectral). Any pointers?

[+] hanman9|9 years ago|reply
Thanks, Jeremy! My background is in chip design verification; this looks like a great starting point for me to dabble in deep learning.
[+] throwaway2016a|9 years ago|reply
I know it says graduate-level math is not required, but it does need some math, correct? What is the minimum?
[+] rsp1984|9 years ago|reply
Will this offer stay free?
[+] ddlutz|9 years ago|reply
Thanks for announcing this, looks amazing! As somebody that's been toying around with deep learning and machine learning, I've been wondering what the steps are to move from 'cool example' to viable product. I know somebody else mentioned something general like that in another comment, but I had a concrete example.

For instance, it's extremely easy to set up an MNIST clone and achieve almost world-record performance for single-character recognition with a simple CNN. But how do you expand that to a real example, for instance to do license plate OCR? Or receipt OCR? Do you have to do two models, one to perform localization (detecting license plates or individual items in a receipt) and then a second which performs OCR on the regions detected by the first? Or are these usually done with a single model that can do it all?

I'm not sure if answering these questions is a goal of your course, or if they're perhaps naive questions to begin with.

[+] jph00|9 years ago|reply
They are excellent questions and that indeed is exactly the goal of this course. I hope you try it out and let us know if you find the answers you need.

For this particular question, a model that does localization and then integrated classification is called an "attentional model". It's an area of much active research. If your images aren't too big, or the thing you're looking for isn't too small in the image, you probably won't need to worry about it.

And if you do need to worry about it, then it can be done very easily - lesson 7 shows two methods to do localization, and you can just do a 2nd pass on the cropped images manually. For a great step by step explanation, see the winners of the Kaggle Right Whale competition: http://blog.kaggle.com/2016/01/29/noaa-right-whale-recogniti...

(There are more sophisticated integrated techniques, such as the one used by Google for viewing street view house numbers. But you should only consider that if you've tried the simple approaches and found they don't work.)
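The "2nd pass on the cropped images" idea above can be sketched in a few lines of numpy. Everything here is hypothetical illustration (the box format, the `crop_region` helper, the shapes), just to show the shape of a two-pass pipeline: pass 1 produces a bounding box, pass 2 classifies the crop.

```python
import numpy as np

def crop_region(image, box):
    """Crop a (height, width, channels) image to a bounding box.

    `box` is (row, col, height, width) -- the output of a hypothetical
    localization model; a real model's box format will differ.
    """
    r, c, h, w = box
    return image[r:r + h, c:c + w]

# Hypothetical two-pass pipeline: localize first, then classify the crop.
image = np.zeros((224, 224, 3))
box = (50, 60, 100, 120)           # pretend this came from the localizer
crop = crop_region(image, box)     # a classifier would run on this crop
```

The crop here has shape (100, 120, 3); in practice you would resize it to whatever input size your classifier expects before the second pass.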

[+] Eridrus|9 years ago|reply
The big struggle with deep models is their thirst for data.

MNIST is considered a simple toy example, and it has 50k images spread across 10 classes.

ImageNet has 1m images spread across 1k classes.

One of the things that has made image recognition (in the form of categorisation) easier is that taking a network pre-trained on ImageNet and then finetuning it to your task actually works pretty well, and requires far fewer images.
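The core idea of finetuning - keep the pretrained weights fixed and train only a new head - can be shown with a toy numpy example. All shapes and names here are illustrative stand-ins, not any real framework's API:

```python
import numpy as np

rng = np.random.default_rng(0)

# W_frozen stands in for pretrained ImageNet weights: never updated.
# W_head is the new task-specific layer: the only thing we train.
W_frozen = rng.normal(size=(64, 32))
W_head = np.zeros((32, 5))           # new 5-class head, trained from scratch

x = rng.normal(size=(8, 64))         # a small batch of inputs
y = rng.normal(size=(8, 5))          # toy targets (squared-error loss)

features = np.maximum(x @ W_frozen, 0)    # frozen feature extractor (ReLU)
grad_head = features.T @ (features @ W_head - y) / len(x)
W_head -= 0.01 * grad_head           # one gradient step on the head only
```

Because gradients only flow into `W_head`, the data requirements are far smaller than training the whole network; in a real framework you'd mark the pretrained layers as non-trainable instead.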

The struggle with something like license plate OCR is that you're unlikely to be able to transfer the learning from ImageNet to your target task.

So, in reality your struggle is going to be more around the data than the model. If you already had a system deployed that was getting data in and you were getting some feedback of when your model failed, then this problem would be easily solved, but if you're building from scratch this is going to be your biggest problem.

And since you don't necessarily know ahead of time how easy or hard your problem is, you don't know how many samples you will need or how much it will cost you.

So, if you did actually want to build a license plate reader using deep learning, my suggestion would be to try and artificially create a dataset by generating images that look like license plates and sticking them in photos in the state you expect to see them in (i.e. blurred, at weird angles, etc) and then training a neural net to recognise them. That would give you a sense for how hard the problem is, and how much data you will need to collect.
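A minimal sketch of that synthetic-data idea might look like this. The rendering step is only stubbed out with noise here; in a real version you would draw the text with something like PIL, then warp and blur it into a background photo:

```python
import random
import string
import numpy as np

CHARS = string.ascii_uppercase + string.digits   # 26 letters + 10 digits = 36

def random_plate(n_chars=6):
    """Generate one hypothetical synthetic training example.

    Returns a placeholder 'image' plus one class id per character
    position. The noise array stands in for a rendered, blurred,
    skewed plate pasted into a photo.
    """
    text = "".join(random.choice(CHARS) for _ in range(n_chars))
    image = np.random.rand(32, 96)                # placeholder rendering
    labels = [CHARS.index(ch) for ch in text]     # per-position class ids
    return image, labels

img, labels = random_plate()
```

Generating data this way lets you measure how accuracy scales with dataset size before committing to collecting real photos.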

In terms of the model, I would probably just try having 6 outputs with 36 classes per output, corresponding to the characters/digits in order. I don't know if it will work well, but it's a good baseline to start with before trying more complicated things like attention models or sequence decoders (https://github.com/farizrahman4u/seq2seq)
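That suggested baseline - 6 independent 36-way outputs - might look roughly like this read-out head in numpy (all names and shapes are illustrative, not from any particular framework):

```python
import numpy as np

def plate_head(features, W):
    """Hypothetical read-out head: 6 independent 36-way softmaxes,
    one per character position, from a shared feature vector."""
    logits = (features @ W).reshape(-1, 6, 36)        # (batch, position, char)
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)         # softmax per position
    return probs.argmax(axis=-1)                      # (batch, 6) char indices

features = np.random.rand(4, 128)   # pretend CNN features for 4 images
W = np.random.rand(128, 6 * 36)     # one weight matrix feeding all 6 outputs
pred = plate_head(features, W)      # 6 predicted character ids per image
```

In a trained model each of the 6 positions would have its own cross-entropy loss, summed into one objective.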

[+] thrawy45678|9 years ago|reply
I see a lot of criticism about tmux and other non-core items being included in the overall curriculum. I think the author is trying to portray the workflow he currently uses and exposing his full tool kit. I don't think he is saying this is "THE" approach one has to follow. I for one think that this is a perfectly legitimate way of teaching. People can leave out pieces which do not interest them, or substitute them with other tool sets if they choose to.

For me the key takeaway here is that someone who has been a consistent top performer in Kaggle competitions for two years, and the founder of an ML company, is teaching a "hands-on" course which fills a gap (from tech talk to step-by-step hands-on), and I think I can live with this method of teaching.

[+] laughingman1234|9 years ago|reply
How does this compare to Udacity's deep learning course?

Or should I take a more theoretical course such as Andrew Ng's to get into ML?

[+] derekmcloughlin|9 years ago|reply
If you've no experience with ML stuff, you might want to start with Andrew Ng's course, which has a small bit on neural networks (MLP and backpropagation), with examples in Matlab/Octave. I found it useful, along with "Make Your Own Neural Network" by Tariq Rashid, which is a very good intro to coding MLPs directly in Python.
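The kind of "code an MLP directly in Python" exercise those intros walk through fits in a few lines of numpy. This is a generic illustrative sketch (not taken from either course): a two-layer network trained on XOR by hand-derived backpropagation.

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR: the classic tiny problem a single-layer perceptron can't solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8))   # input -> hidden weights
W2 = rng.normal(size=(8, 1))   # hidden -> output weights

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss():
    return float(np.mean((sigmoid(sigmoid(X @ W1) @ W2) - y) ** 2))

initial_loss = loss()
for _ in range(5000):
    h = sigmoid(X @ W1)                   # hidden activations
    out = sigmoid(h @ W2)                 # network output
    d_out = (out - y) * out * (1 - out)   # chain rule at the output
    d_h = (d_out @ W2.T) * h * (1 - h)    # backprop into the hidden layer
    W2 -= 1.0 * h.T @ d_out               # gradient descent updates
    W1 -= 1.0 * X.T @ d_h
final_loss = loss()
```

Writing the two backprop lines yourself, rather than calling a framework, is what makes the chain rule click for most people.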
[+] cr0sh|9 years ago|reply
Ng's course would be a great place to start, imho - though I am a bit biased: I started out my journey by taking the ML Class in 2011. Lots of "concretely"s strewn about!

Anyhow, it was a great introduction, and light on the calc (but more emphasis on probability and linear algebra). If you have Matlab or Octave experience, it will also help (I didn't - the revelation of having a vector primitive was wonderful once I got into the swing of it, though).

Note again, though, that I took the ML Class - not the Coursera version; I have heard that they are identical, but it has been 5+ years since I took it, too.

[+] tom_wilde|9 years ago|reply
If you're getting started I totally recommend Andrew Ng's course.
[+] cr0sh|9 years ago|reply
I'm currently taking the Udacity Self-Driving Car Engineer Nanodegree, and am working on the lab right before the 2nd project lesson (I'm in the November cohort). That lab is to re-create the LeNet-5 CNN in TensorFlow (we have to re-create the layers of the convnet).

Last night I spent an hour or so getting my system (Ubuntu 14.04 LTS) set up to use CUDA and cudnn with Python 3; setting up the drivers and everything for TensorFlow under Anaconda - for my GTX-750ti.

That wasn't really straightforward, but I ultimately got it working. It probably isn't needed for the course, but it was fun to learn how to do it.

I would like to take this fast.ai course as well, but so far the Udacity one is eating all of my (free) time. Maybe I can give it a shot in the future.

[+] mikealche|9 years ago|reply
Hey, so I know that $0.90 isn't much... But is there a difference between running the programs on AWS and on my local computer with my mediocre video card? Will they take much longer to train? I honestly don't know the difference, so I'm asking.
[+] slashcom|9 years ago|reply
Your mediocre graphics card may or may not have CUDA support. If it doesn't, or it only supports an old version, then it's the same as not having a graphics card at all. And of course, Nvidia only.

If it is supported, then the major difference is GPU memory, which limits the size of the network you can train. The newest models are faster than some 1-2 year old ones, but older hardware does the job fine.

[+] cicero19|9 years ago|reply
Hi Jeremy, looking forward to going through these. Thanks for sharing. Very noble decision to make them free.

I am wondering why you chose to leave Enlitic and start fast.ai?

[+] jph00|9 years ago|reply
Probably too long a story for an HN comment! ;)

In short, my wife got sick and needed brain surgery while she was pregnant, and I ended up being away from Enlitic for nearly a year. It made me reassess what I really wanted to do with my time.

Now that I spend all my time coding and teaching, I'm much happier. And I think that making deep learning more accessible for all will be more impactful than working with just one company. Deep learning has been ridiculously exclusive up until now, on the whole, and very few people are researching in the areas that I think matter the most.

Finally, I think I achieved what I set out to do with Enlitic - deep learning for medicine is now recognized as having great potential and lots of folks are working on it.

[+] codingdave|9 years ago|reply
Thanks for sharing this -- I've been doing hobby-level work with computer vision on the side for a couple years now, but always kinda hit a wall when moving beyond anything trivial. I'll give this a shot and see where it takes me!
[+] jph00|9 years ago|reply
Yeah I know just what you mean. Check out the feedback from Christopher Kelly here, who describes something very similar: http://course.fast.ai/testimonials.html . Perhaps you'll find some inspiration there...

I really hope that you get past the wall! If you do find yourself getting stuck, demotivated, etc, please do come join the community on the forums, since they can really help overcome any issues you have: http://forums.fast.ai/

[+] flik|9 years ago|reply
>>you need access to a computer with an Nvidia GPU, along with a python-based deep learning stack set up on it.

Is an Nvidia GPU a must-have... or can I work around it?

[+] jph00|9 years ago|reply
Yes, you can get by with an AMD GPU but it'll be a lot more work, and a lot more frustration.

It's worth spending the $0.90 per hour to use AWS, or less if you get spot instances.

You can do much of your prototyping on just a CPU, and only run on the GPU on more data once it's working well.

[+] girfan|9 years ago|reply
This is absolutely amazing. Thanks for the effort. How do we join the slack group for interactive help/questions?
[+] jph00|9 years ago|reply
Our preference is for most questions to go through the forums. When you join the forums you'll get more information on why we think that's best for the overall community (in short, because it's easier for others to find answers when they are organized by topic), and also how to access the Slack channel.
[+] kp25|9 years ago|reply
Can I use Google Compute Engine instead of AWS?
[+] jph00|9 years ago|reply
Last time I checked they didn't have GPU virtual machines. So no, probably not. If there is a way to run jupyter notebooks on GPU machines on Google, I'd certainly be interested in learning about it.
[+] bogomipz|9 years ago|reply
This is great! Thanks for sharing!