
Machine Learning Cheat Sheet Map

114 points| cognibits | 11 years ago |scikit-learn.org | reply

17 comments

[+] ColinWright|11 years ago|reply
It's currently down - is it the same one discussed at some length a year ago?

https://news.ycombinator.com/item?id=5831512

Here are some other resources for machine learning, not necessarily restricted to the algorithms implemented in scikit-learn:

http://eferm.com/wp-content/uploads/2011/05/cheat3.pdf

http://peekaboo-vision.blogspot.ca/2013/01/machine-learning-...

http://rise.cse.iitm.ac.in/wiki/index.php/Introduction_to_Ma...

http://mlg.eng.cam.ac.uk/creed/Notes/ML_Compendium.pdf

[+] dj-wonk|11 years ago|reply
Dimensionality reduction can be a goal in and of itself, but many of the same techniques (e.g. feature selection) are useful precursors for classification, clustering, and regression. It would be nice to capture that on the diagram. More arrows, please. :)
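To illustrate the "precursor" point, here is a minimal sketch in plain Python (standard library only, not scikit-learn's API). It assumes a toy dataset, drops zero-variance columns as a very simple form of feature selection, and then fits a nearest-centroid classifier on what remains. The function names and the data are hypothetical, chosen only for illustration.

```python
from statistics import mean, pvariance


def select_by_variance(rows, threshold=0.0):
    """Return indices of columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def project(rows, keep):
    """Keep only the selected columns of every row."""
    return [[row[i] for i in keep] for row in rows]


def fit_centroids(rows, labels):
    """Compute one mean vector (centroid) per class label."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [mean(col) for col in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], row))


# Hypothetical toy data: the middle column is constant and carries no signal.
X = [[1.0, 5.0, 0.9], [1.2, 5.0, 1.1], [3.9, 5.0, 4.2], [4.1, 5.0, 3.8]]
y = ["a", "a", "b", "b"]

keep = select_by_variance(X)                    # feature selection step
centroids = fit_centroids(project(X, keep), y)  # classification step
label = predict(centroids, project([[1.1, 5.0, 1.0]], keep)[0])
```

The same selected columns could just as well feed a clustering or regression step, which is the kind of extra arrow the comment is asking for.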
[+] dj-wonk|11 years ago|reply
The "text data" decision point seems arbitrary and, in my opinion, not useful. I've analyzed text data with a Naive Bayes classifier as well as an SVM. I really like what the chart is trying to be, but I think it is editorializing too much.
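For readers unfamiliar with the first of those two options, here is a rough sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing, written in plain Python rather than with scikit-learn. The documents, labels, and function names are made up for illustration, and the SVM alternative is not shown.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(model, doc):
    """Return the class with the highest log posterior for the document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in doc.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs nonzero.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus.
docs = [
    "cheap pills buy now",
    "buy cheap watches",
    "meeting agenda for monday",
    "project meeting notes",
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(docs, labels)
```

Calling `classify(model, "cheap pills")` picks the spam class on this toy corpus, and `classify(model, "monday meeting")` picks ham.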
[+] scottlocklin|11 years ago|reply
That entire map is complete and utter baloney as a general guide, though it may mean something in terms of scikit-learn.

Hastie has a decent classification of the strengths and weaknesses of different learning algorithms in his book. It's not a decision tree. It will never be a decision tree.

[+] yaur|11 years ago|reply
Seeing this a while ago, as someone who was a complete noob to machine learning, I found it extremely useful. It may not be the end-all-be-all, but it was a useful graphic for figuring out which algorithms I should research to solve a specific ML problem.
[+] cognibits|11 years ago|reply
I guess the chart is designed for those who are taking their first steps in the machine learning world. It brings some order. BTW, I have nothing to do with scikit-learn; I shared it because I found it useful.
[+] dj-wonk|11 years ago|reply
I don't see why the diagram has SVC and ensemble classifiers located in the "not working" path from KNeighbors Classifiers. It is reasonable to use an ensemble method independent of whether nearest neighbors works.
[+] dj-wonk|11 years ago|reply
I'd suggest giving Random Forests a call-out instead of leaving them hidden under ensemble methods in the diagram. I realize this is a clarity / detail tradeoff.
[+] iLoch|11 years ago|reply
"This project has been temporarily blocked for exceeding its bandwidth threshold." I wonder why no one uses SourceForge anymore...
[+] factotvm|11 years ago|reply
I feel like that's SourceForge's job. When something generates excitement, that's the last moment you want to throw up a page like this. Poor user experience. Would, say, GitHub do this?

I suppose something else could be afoot, but I feel like controls should be in place for that elsewhere, e.g. you can't upload a gigantic binary.

[+] CyberShadow|11 years ago|reply
What's going on here? The page is hosted on the scikit-learn.org domain, and I don't see any redirects or frames pointing to SourceForge. Or is SourceForge allowing projects to point arbitrary domains at it?