danger's comments

danger | 13 years ago | on: Predicting March Madness

FWIW, I think the title change removes a lot of information about what the post is about.

danger | 13 years ago | on: A breakdown of how I was talked out of $100

So this strategy is really only effective when you'd like to raise rent by 9.1% per year?

danger | 13 years ago | on: How deep learning on GPUs wins datamining contest without feature engineering

Agreed. Here's a previous HN discussion on that topic: http://news.ycombinator.com/item?id=4281630

danger | 13 years ago | on: Machine learning for the impatient: algorithms tuning algorithms

Another way to say what you're proposing is to use a linear predictor, and to train it by globally optimizing 0-1 loss (just the number of mistakes that you make) on the full set of data that you have. Even ignoring computational issues (this can be shown to be NP-hard), you seem to be making several mistaken assumptions. I'd really recommend reading some basic stuff on generalization, but a couple of the mistaken assumptions are as follows:

1. That the particular input features that you've chosen are somehow the only possible choice. But who's to say that you shouldn't add new features which are the square of each original feature? Or maybe some cross product terms, like the product of the ith feature times the jth feature. Or maybe some good features to add would be the distance from each point you've seen so far. Etc. Continuing down this path, you basically get to the question discussed in the OP about choosing a kernel for SVMs. This is just one example where hyperparameters come into play, and you need some method for choosing them.

2. That a linear predictor is impervious to overfitting. Consider the extreme case (which comes up often) where you have millions or billions of features and far fewer examples (e.g., if features are n-gram occurrences in text, or gene expression data). Then it's likely that there are many settings of weights that fit the data perfectly, but there's no way to tell if you're just picking up on statistical noise, or if you've learned something that will make good predictions on new data that you encounter. In both theory and practice, you need some form of regularization, and along with this comes more hyperparameters, which need to be chosen.

Finally, by your reasoning, it seems like you would always choose a 1-nearest neighbors classifier [1] (because it will always end up with 0 error under the setting your propose). But there's no reason why this is in general a good idea.

[1] http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm

danger | 13 years ago | on: Machine learning for the impatient: algorithms tuning algorithms

I'd say the closest thing to a cohesive community would be the MetaOptimize Q&A forum, but maybe others have other suggestions:

http://metaoptimize.com/qa

danger | 13 years ago | on: Machine learning for the impatient: algorithms tuning algorithms

Also, for people who are interested in the application of predicting college basketball with machine learning, there's a Google group that is worth joining:

https://groups.google.com/group/machine-march-madness

danger | 13 years ago | on: Machine learning for the impatient: algorithms tuning algorithms

As another commenter pointed out, the accuracy really needs to be evaluated using a validation set, not the test set--the approach described in the post is training with the testing data. In the field, we call this "cheating".

The basic idea of automatically tuning hyperparameters (the things this post discusses tuning with genetic algorithms) is cool, though, and is becoming a popular subject in machine learning research. A couple recent research papers on the topic are pretty readable:

Algorithms for Hyper-Parameter Optimization:

http://books.nips.cc/papers/files/nips24/NIPS2011_1385.pdf

Practical Bayesian Optimization of Machine Learning Algorithms:

http://arxiv.org/abs/1206.2944

danger | 14 years ago | on: Git repo for predicting March Madness using machine learning

Of particular note is the use of Theano for the machine learning heavy-lifting. If you do machine learning and haven't looked into Theano, you're probably making things harder on yourself than it needs to be.

danger | 14 years ago | on: March Madness for Machines, 2012 edition

Last year there was discussion where some HNers suggested that they'd be interested in participating in the competition if we allowed other forms of data: http://news.ycombinator.com/item?id=2321009

This is the early announcement this year, where we're soliciting suggestions about other forms of data to include for this year.

danger | 14 years ago | on: Stanford Class: Probabilistic Graphical Models

I did research with Daphne (and co-authored a paper with her) in my senior year of undergrad, and she demanded a high standard of work, yes, but she was an excellent supervisor. Everybody knows how brilliant she is, but she also put a lot of effort into teaching my (also undergrad) project partner and I about how to formulate a research problem, how to do research, and how to present research. The primary concern appeared to be our personal growth, not the research machine (though that's not to say that the research wasn't important).

Working with her was one of the highlights of my undergrad education, and her class was great, too.

danger | 14 years ago | on: Long-standing Google Apps bug forces users to renew domain names

See also these threads and note the number of complaints and the span of dates:

http://www.google.com/support/forum/p/Google+Apps/thread?tid...

I'm personally having the same problem today, when the auto renewal date is a month away. I've filed two tickets without response. It seems the only solution people have found is to remove their credit card information from Google Checkout.

danger | 14 years ago | on: Who Does Facebook Think You Are Searching For?

For mine to display scores, I had to remove the "[0]" after "filter": https://www.facebook.com/ajax/typeahead/search/first_degree....

With the "filter[0]" in the url, all scores came back as 0 for me.

danger | 14 years ago | on: ~=

http://www.google.com/search?q=%22!=%22

danger | 15 years ago | on: Selection Sunday: Is your March Madness prediction algorithm ready?

Yeah, unfortunately the "getting the word out" could have been done better.

danger | 15 years ago | on: Selection Sunday: Is your March Madness prediction algorithm ready?

This was the intention. Around a month ago we started asking what data people would like to use. We incorporated some of that feedback to decide what data to use for this year.

If you have other suggestions, please let us know, and we'll add it for next year (if possible). The only thing we're trying to avoid is somebody coming in with a lot of data at the last minute, beyond the point when anybody else can realistically get it incorporated into their model.

danger | 15 years ago | on: Selection Sunday: Is your March Madness prediction algorithm ready?

One other issue that comes up is "garbage time". When a game isn't close (say in the last quarter of a blow-out), the stats are basically meaningless. Does Pomeroy have good ideas about how to deal with that?

danger | 15 years ago | on: Selection Sunday: Is your March Madness prediction algorithm ready?

I haven't run it yet. It should be done in time to enter the contest, though (but it will just be a baseline--i.e., not eligible to win prizes).

danger | 15 years ago | on: Predicting March Madness: On Tournament Structure and Bracket Scoring Rules

Thanks. I (obviously) agree. The only problem is that it's not always computationally easy to optimize the thing you really care about.

danger | 15 years ago | on: George Dahl: Machine Learning for March Madness

This paper (referenced in the post) is also relevant. The application is to NBA basketball:

"Incorporating Side Information into Probabilistic Matrix Factorization Using Gaussian Processes." Ryan Prescott Adams, George E. Dahl, and Iain Murray. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, 2010.

Paper: http://www.cs.toronto.edu/~gdahl/papers/dpmfNBA.pdf

Code: http://www.cs.toronto.edu/~rpa/code/dpmf-nba.tgz

danger | 15 years ago | on: Vicarious Systems Says Its Artificial Intelligence Is The Real Deal

There are many different actual tasks that technically are PASCAL challenges, but when people say "PASCAL VOC challenge" (Visual Object Classes), they typically mean either the _classification_ or _detection_ challenge:

Classification: For each of the twenty classes, predicting presence/absence of an example of that class in the test image.

Detection: Predicting the bounding box and label of each object from the twenty target classes in the test image.

Neither uses the full ImageNet data set. Instead, it's images from 20 classes of object, like shown here: http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2010/exam...