
Artificial Intelligence Needs a Bullshit Meter

53 points | amplifier_khan | 9 years ago | gab41.lab41.org

13 comments

tensor | 9 years ago
This is a very confused article with many errors. It appears to imply that no one validates the accuracy of machine learning models, or that practitioners lie about the accuracy. Even more strangely, it suggests that the only way to address this is with ensemble methods.

Validation of models is one of the most important parts of any machine learning system. Every expert practitioner measures the accuracy of their models with well-established methods such as cross-validation or hold-out tests. So the basic premise of this article seems quite at odds with reality.
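To make the point concrete, here is a minimal sketch of k-fold cross-validation (toy labels and a hypothetical majority-class "model", not any real library): accuracy is always measured on a fold the model never saw during fitting.

```python
# Toy k-fold cross-validation. The "model" is a hypothetical
# majority-class classifier: it predicts whichever label is most
# common in the training fold.
from collections import Counter

def majority_class(labels):
    """'Fit' step: most common label in the training fold."""
    return Counter(labels).most_common(1)[0][0]

def k_fold_accuracy(labels, k=5):
    """Average accuracy over k held-out folds."""
    fold_size = len(labels) // k
    scores = []
    for i in range(k):
        test = labels[i * fold_size:(i + 1) * fold_size]
        train = labels[:i * fold_size] + labels[(i + 1) * fold_size:]
        pred = majority_class(train)
        scores.append(sum(1 for y in test if y == pred) / len(test))
    return sum(scores) / len(scores)

labels = [0] * 80 + [1] * 20
print(k_fold_accuracy(labels, k=5))  # 0.8: the minority-class fold is missed
```

Real systems do the same thing with actual models and shuffled folds; the structure (train on k-1 folds, score on the remaining one, average) is identical.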

Further, the article uses the security domain as an example of the lack of validation. Most applications of ML to security use unsupervised algorithms to perform anomaly detection, which is an entirely different thing from a supervised algorithm. Anomaly detection via unsupervised algorithms is well known to produce many false positives.
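A toy illustration of why (hypothetical 3-sigma z-score detector on synthetic, purely benign data): with no attacks present at all, the tail of the distribution still gets flagged, so every alert is a false positive by construction.

```python
# Unsupervised anomaly detection sketch: flag points more than
# 3 standard deviations from the mean. The data is synthetic and
# contains no real anomalies.
import random

random.seed(1)
benign = [random.gauss(0, 1) for _ in range(10_000)]  # benign traffic only

mean = sum(benign) / len(benign)
var = sum((x - mean) ** 2 for x in benign) / len(benign)
std = var ** 0.5

alerts = [x for x in benign if abs(x - mean) / std > 3]  # 3-sigma rule
print(len(alerts))  # dozens of alerts, all false positives
```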

But possibly the worst error in the article is suggesting that ensemble methods are a way to validate the accuracy of a model. An ensemble technique is not a way to validate accuracy; rather, it is a way to try to obtain higher accuracy. You still need to validate your ensemble via something like k-fold cross-validation to understand the expected error.
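A sketch of that distinction (hypothetical threshold "models" combined by majority vote): ensembling changes the prediction, but the ensemble's accuracy is still estimated against held-out labels, exactly as for a single model.

```python
# Three toy threshold "models" combined by majority vote.
def make_threshold_model(t):
    return lambda x: 1 if x > t else 0

models = [make_threshold_model(t) for t in (0.3, 0.5, 0.7)]

def ensemble(x):
    """Majority vote over the three base models."""
    votes = sum(m(x) for m in models)
    return 1 if votes >= 2 else 0

# Held-out points and labels (hypothetical). Ensembling does not
# validate anything: accuracy still comes from comparing to labels.
test_xs = [0.1, 0.4, 0.6, 0.9]
test_ys = [0,   0,   1,   1]

acc = sum(1 for x, y in zip(test_xs, test_ys) if ensemble(x) == y) / len(test_ys)
print(acc)
```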

nl | 9 years ago
> This is a very confused article with many errors.

Au contraire! It is a good article which highlights a number of subtle points!

> But possibly the worst error in the article is suggesting that ensemble methods are a way to validate the accuracy of a model. An ensemble technique is not a way to validate accuracy. Rather it's a way to try to obtain higher accuracy.

Err... to quote the article: "One of the ways to improve result quality is by running ensembles of algorithms."

The talk of blending recommender systems and deep learning appears to be inspired by Google's Wide and Deep Learning[1] work, which is effectively a way of blending global and local results.

> Every expert practitioner measures the accuracy of their models with well-established methods such as cross-validation or hold-out tests.

The problem here is knowing how well the model will work on radically (or even somewhat) different data than it was trained on. That is not the same thing as doing cross-validation or hold-out testing.

For example, the ImageNet set has an enormous number of dog pictures. This means its CV or hold-out performance tends to translate well to performance on similar datasets, and if the new dataset has a lot of dog pictures it will translate very well.

However, if you attempt to use a network trained on ImageNet in a completely different context (classifying X-rays, for example), it is unclear how well it will perform before testing.
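A toy sketch of that gap (hypothetical 1-D data; all means and thresholds made up): a decision rule that validates well on held-out data from the training distribution degrades badly once the distribution shifts.

```python
# Two-class 1-D data: class 0 and class 1 are Gaussians with
# different means. The "model" is a fixed decision threshold.
import random

random.seed(0)

def sample(mean0, mean1, n):
    """n points per class from the two Gaussians."""
    xs = [random.gauss(mean0, 0.5) for _ in range(n)] + \
         [random.gauss(mean1, 0.5) for _ in range(n)]
    ys = [0] * n + [1] * n
    return xs, ys

def accuracy(threshold, xs, ys):
    return sum((x > threshold) == bool(y) for x, y in zip(xs, ys)) / len(ys)

threshold = 1.0  # "fit" midway between the training class means (0 and 2)

holdout = sample(0.0, 2.0, 500)  # same distribution as training
shifted = sample(2.0, 4.0, 500)  # class means moved: a new context

print(accuracy(threshold, *holdout))  # high
print(accuracy(threshold, *shifted))  # much lower, near chance
```

The hold-out score says nothing about the shifted score; only testing on the new distribution reveals it.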

> Further, the article uses the security domain as an example of the lack of validation. Most applications of ML to security use unsupervised algorithms to perform anomaly detection. This is an entirely different thing from a supervised algorithm. Anomaly detection via unsupervised algorithms is well known to have many false positives.

Lab41 works in the intelligence space. That isn't your normal computer-security anomaly detection. Have a look at their other work[2]; only one item there is conventional security log-file analysis.

[1] https://research.googleblog.com/2016/06/wide-deep-learning-b...

[2] http://www.lab41.org/work/

Animats | 9 years ago
From the article: "What I hope I have done is sufficiently piqued your interest to get you involved in Lab41." So this is an ad.

While I sort of agree that AI could use a bullshit meter, it's way better than it was in the 1980s. Today, much of the stuff actually works. Real work is done with AI. Deposit a handwritten check at an ATM and watch it be read properly. I'm amazed that works.

make3 | 9 years ago
Again with the use of the word AI for what is really just supervised (deep) machine learning. Pretty vacuous article on a subject covered at length since the dawn of machine learning by a very large number of authors, in much more detail than here.

tnecniv | 9 years ago
I'm curious as to what you consider AI. I tend to feel AI is a catch-all for hard problems that we don't know how to solve. Once we know how to solve something, we give it a name.