Nassim Taleb: We should retire the notion of standard deviation

248 points | pyduan | 12 years ago | edge.org | reply

244 comments

[+] Homunculiheaded|12 years ago|reply
I sometimes think that progress in the 21st century will be summed up as: "The realization that the normal distribution is not the only way to model data".

Taleb's favorite topic is the "black swan event", which is something that the normal distribution, and the idea of standard deviation, don't model that well. In a normal distribution, very extreme events should only happen once in the lifetime of several universes. Of course, assuming variation in line with a Gaussian process is at the heart of how the Black-Scholes model calculates risk/volatility/etc.

Benoit Mandelbrot argued that financial markets follow a distribution much closer to the Cauchy distribution (more precisely, the family of Lévy alpha-stable distributions) than to a Gaussian. The problem, of course, is that the Cauchy distribution is pathological: it doesn't have a mean or variance. You can calculate similar properties for it (location and scale), but it doesn't obey the central limit theorem, so in practice it can be very strange to work with.
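A quick way to see the pathology (a minimal sketch; the seed and checkpoint sizes are arbitrary): the running mean of Gaussian draws settles down as samples accumulate, while the running mean of Cauchy draws never does, because the Cauchy distribution has no mean.

```python
import math
import random

random.seed(42)

def running_means(sampler, checkpoints=(1_000, 10_000, 100_000)):
    """Running mean of the sampler's output at a few sample sizes."""
    total, means = 0.0, {}
    for i in range(1, max(checkpoints) + 1):
        total += sampler()
        if i in checkpoints:
            means[i] = total / i
    return means

# Gaussian: the running mean settles near 0 as n grows.
gauss_means = running_means(lambda: random.gauss(0, 1))

# Cauchy, sampled via the inverse-CDF trick tan(pi*(u - 1/2)): the running
# mean keeps jumping around no matter how many samples you draw.
cauchy_means = running_means(lambda: math.tan(math.pi * (random.random() - 0.5)))

print("gaussian:", gauss_means)
print("cauchy:  ", cauchy_means)
```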

The normal distribution is fantastic in that it does appear frequently in nature, is very well behaved, and has been extensively studied. However a great amount of future progress is going to come from wrestling with more challenging distributions, and paying more attention to when assumptions of normality need to be questioned. Of course one of the challenges of this is that the normal distribution is baked into a very large number of our existing statistical tools.

[+] jerf|12 years ago|reply
This is actually what I expected to read: "The standard deviation is useful because with the average and the standard deviation, one can fully characterize a normal distribution. However, the standard deviation is a less useful statistical summary the farther from 'normal' you get, and in reality there is no such thing as a normal distribution, since a true normal distribution is defined on the entire real number line from negative infinity to positive infinity. Reality always provides some bound, and it's often quite distorted from Gaussian. For instance, a 'normal' distribution averaging 2 with a standard deviation of 1.4, bounded below by 0, is quite non-Gaussian in many important ways! (Not least of which is that you're going to have to do something to replace the missing probability...)

"People rarely check how closely their data conform to the normal distribution; indeed, many people blindly apply the standard deviation to their data regardless of its distribution! The resulting number is often more obfuscatory than helpful, to the extent that it crowds out more useful summaries.

"It's a useful metric when treated carefully, but it is rare to see it treated carefully. Science courses would be well served to stop teaching it in favor of a stronger emphasis on multiple distributions. (Multiple distributions are usually touched upon, but our curricula implicitly overfavor the Gaussian distribution and end up accidentally convincing students it's the only one.)"

But that's just me.

[+] Helianthus|12 years ago|reply
>"The realization that the normal distribution is not the only way to model data".

Realization by whom? If you understand the normal distribution, you had damn well better know that there are other probability distributions.

>The problem, of course, is that the Cauchy distribution is pathological: it doesn't have a mean or variance. You can calculate similar properties for it (location and scale), but it doesn't obey the central limit theorem, so in practice it can be very strange to work with.

In other words, we're using the normal distribution as the workhorse because considering other distributions is, well, inefficient/unproductive.

> However a great amount of future progress is going to come from wrestling with more challenging distributions, and paying more attention to when assumptions of normality need to be questioned

What exactly is it that you think physicists have been doing for the past half century? The error accounting for CERN's experiments requires actual millions of PhD-hours.

This topic's conversation is at some bizarre intersection of good intentions, concrete knowledge, and woeful ignorance. I guess I tar myself with that brush.

[+] n00b101|12 years ago|reply
Taleb has a good point about people mistakenly interpreting standard deviation (sigma) as mean absolute deviation (MAD). I like that he gives some conversions (sigma ~= 1.25 * MAD for a normal distribution).

I think it's rather silly to talk about "retiring" standard deviation, but we can't blame Taleb - the publication itself posed the question "2014: What Scientific Idea is Ready for Retirement?" to various scientific personalities.

What Taleb failed to mention is that, once properly understood, standard deviation has distributional interpretations that can be much more useful than MAD. For example, if the data are approximately normally distributed, then there is approximately a 99.99% probability that the next observation will fall within 4 sigma of the mean.
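Both of those figures are easy to check with nothing but the standard library (a sketch; the sample size for the simulation is arbitrary):

```python
import math
import random

# Exact ratio sigma / MAD for a normal distribution: sqrt(pi/2) ~ 1.2533.
print(math.sqrt(math.pi / 2))

# Two-sided probability of a draw landing within 4 sigma of the mean:
# P(|X - mu| <= 4*sigma) = erf(4 / sqrt(2)) ~ 0.99994.
print(math.erf(4 / math.sqrt(2)))

# Sanity-check the 1.25 conversion by simulation.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(200_000)]
mad = sum(abs(x) for x in xs) / len(xs)
sd = math.sqrt(sum(x * x for x in xs) / len(xs))
print(sd / mad)  # close to 1.2533
```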

Not everything is approximately normally distributed, but a lot of phenomena ARE normally distributed. It's a well known fact that the phenomena which Taleb is most interested in (namely, financial return time-series) are not normally distributed. But I would like to know how Taleb proposes to "retire" volatility (sigma) from financial theory and replace it with MAD? Standard deviation is so central in finance that even the prices of some financial instruments (options) are quoted in terms of standard deviation (e.g. "That put option is currently selling at 30% vol"). How do we rewrite Black-Scholes option pricing theory and Markowitz portfolio theory in terms of MAD and remove all the sigmas everywhere? Surely Taleb has already written that paper for us so that we can retire standard deviation?

[+] regularfry|12 years ago|reply
I think his point is that Black-Scholes et al are holed beneath the waterline precisely because they involve standard deviations. In his world, you're better off being unable to price an option than you would be with Black-Scholes. Your example of "That put option is currently selling at 30% vol" is actually an example of why the system is so completely broken: if volatility as standard deviation was valid, all options against the same underlying instrument would have the same implied volatility. The volatility smile shouldn't exist.

This wouldn't matter if the down-side wasn't so crippling.

I don't think Taleb has to be the one to propose a replacement for portfolio theory, and I think criticism of him for not doing so is pointless. You don't need to have a spare tire handy to point out that your neighbour's car has a flat, and you don't have to run an airline to tell people not to get on a plane with the engines visibly on fire.

[+] repsilat|12 years ago|reply
> For example, if the data are approximately normally distributed, then there is approximately a 99.99% probability that the next observation will fall within 4 sigma of the mean.

Could you sketch out why similar statements couldn't be made about MAD? My (possibly flawed) intuition is that the expected proportion of observations within n*MAD should be similarly independent of the parameters of the normal distribution.
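That intuition appears correct (a sketch: for a normal distribution, MAD = sigma * sqrt(2/pi), so the coverage within n*MAD works out to erf(n / sqrt(pi)), which contains no distribution parameters at all):

```python
import math

# For a normal distribution, MAD = sigma * sqrt(2/pi), so the coverage
# within n*MAD is erf(n / sqrt(pi)) -- independent of mu and sigma, exactly
# like the coverage within n*sigma, which is erf(n / sqrt(2)).
for n in range(1, 7):
    print(f"n={n}: within n*sigma {math.erf(n / math.sqrt(2)):.6f}, "
          f"within n*MAD {math.erf(n / math.sqrt(math.pi)):.6f}")
```

Since 5 MAD is roughly 4 sigma, the "99.99%" statement above translates to "within roughly 5 MAD".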

[+] JASchilz|12 years ago|reply
The central limit theorem shows us that unimodal data with lots of independent sources of error tends towards a normal distribution. That description is a good first-pass, descriptive model for lots and lots of contexts, and standard deviation speaks well to normally distributed data.

Squaring error isn't just a convenient way to remove sign; it's driven by a lot of data sets' conformance to the central limit theorem.
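A small illustration of that convergence (a sketch with arbitrary choices: 30 uniform error sources per measurement, 50,000 simulated measurements): summing non-Gaussian errors produces data for which the familiar normal coverage rules hold.

```python
import math
import random

random.seed(7)

# Sum many independent, decidedly non-Gaussian error sources (uniform on
# [-1, 1]) and check that the normal coverage rules emerge anyway.
def noisy_measurement(sources=30):
    return sum(random.uniform(-1, 1) for _ in range(sources))

data = [noisy_measurement() for _ in range(50_000)]
mean = sum(data) / len(data)
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

within_1sd = sum(abs(x - mean) <= sd for x in data) / len(data)
within_2sd = sum(abs(x - mean) <= 2 * sd for x in data) / len(data)
print(f"within 1 sd: {within_1sd:.3f} (normal: 0.683)")
print(f"within 2 sd: {within_2sd:.3f} (normal: 0.954)")
```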

[+] programminggeek|12 years ago|reply
I think because it's called "standard deviation" that it sounds like the thing to use or look for. It sounds more correct because of the word standard.

I feel like it is the same kind of failure of human perception of language that programmers have with exceptions and errors, especially the phrase "exceptions should only be used for exceptional behavior". That's a cool phrase, but people latch onto it because the word "exception" sounds like something extremely rare and out of the ordinary, whereas we see errors as common; they are in fact the same thing. Broke is broke, it doesn't matter what you call it, but thousands of programmers think differently because of the name we gave it.

We are human and language absolutely plays a role in our perception of things.

[+] klodolph|12 years ago|reply
> I think because it's called "standard deviation" that it sounds like the thing to use or look for.

Yes! Because it's an awesome trick and lets you do good estimates on napkins.

The other day I was buying lunch at a food cart and thought about how much change the food carts had to carry, as a function of how many customers they have, under the assumption that they want to be able to provide correct change to 99% of their customers.

Let's say that the average amount of change a customer needs is $5, and a 99-th percentile customer needs $15 in change. If we pretend that the distribution is approximately Gaussian we can calculate that 1,000 food carts with 1 customer each would need $15,000 in change, but 1 food cart with 1,000 customers would need $5 x 1,000 + ($15 - $5) * sqrt(1,000) ≈ $5,320. That's math you can do in your head without a calculator (being a programmer, 1,000 ≈ 2^10 so sqrt(1,000) ≈ 2^5).
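The napkin math can be sanity-checked by simulation (a sketch; the Gaussian per-customer model and the $5 / $15 figures are the comment's hypothetical numbers, and being napkin math we ignore the small chance of a negative draw):

```python
import math
import random

random.seed(1)

# Hypothetical numbers from above: per-customer change needed averages $5,
# and a 99th-percentile customer needs $15. Model the per-customer need as
# roughly Gaussian with sigma = (15 - 5) / 2.33 (z at the 99th percentile).
mu, p99 = 5.0, 15.0
sigma = (p99 - mu) / 2.33

def cart_total(customers):
    return sum(random.gauss(mu, sigma) for _ in range(customers))

# 99th percentile of one cart's total over 1,000 customers, by simulation.
totals = sorted(cart_total(1000) for _ in range(2000))
simulated_p99 = totals[int(0.99 * len(totals))]

# Napkin estimate: mean * N + (p99 - mean) * sqrt(N).
napkin = mu * 1000 + (p99 - mu) * math.sqrt(1000)

print(f"napkin ~ ${napkin:.0f}, simulated 99th percentile ~ ${simulated_p99:.0f}")
```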

The standard deviation and assumptions of normality are so useful because of the central limit theorem. That is, if you have many iid variables which have finite standard deviation the sum will converge to a Gaussian distribution as the number of variables increases.

Then you say "Well, the standard deviation weighs the tail too heavily" and the response is "Well, use higher-order moments then, that's what they're for".

[+] cheald|12 years ago|reply
I really tried to get through "The Black Swan" and Taleb's writing struck me as so pretentious and self-involved that it made it impossible for me to finish.

He strikes me as someone who is so desperate to be important and recognized that an assertion like this doesn't really surprise me.

[+] Alex_MJ|12 years ago|reply
His facebook doesn't inspire a lot of confidence. It's full of vague availability-heuristic-targeting generalities.

"Virtue is when the income you wish to show the tax agency equals what you wish to show your neighbor"

"The problem is that academics really think that nonacademics find them more intelligent than themselves",

etc

I've read black swan as well, and there were parts I didn't quite grok at the time (You cannot predict a black swan! etc)

My take is that he's a Malcolm Gladwell with numbers, and therefore less easily takedownable, but with the same model of "hold prestigious academic position, appear wise, publish book with simple principle, be easily referenceable by people who want to sound well-read", except he has lots of math and charts to point at lest anyone call him out on it.

[+] triplesec|12 years ago|reply
Taleb's problem is an intense narcissism, allied with strong intelligence, though not as strong as his narcissism would lead him to believe. So we have to listen carefully to find his useful insights while wading through the bombast.

Don't try to engage with him critically, but constructively, directly on the internets though, because he prefers acolytes to arguments!

[+] calroc|12 years ago|reply
I started reading that book and I thought he was a genius, but then, right about the point where he starts bagging on the Uncertainty Principle, I realized that he's actually kind of an idiot.

The book does make a good point though.

[+] olifante|12 years ago|reply
Not really a surprise for someone who was so in thrall to Benoît Mandelbrot, another world-class narcissist.

Here's Taleb on Mandelbrot: "[He] had perhaps more cumulative influence than any other single scientist in history, with the only close second, Isaac Newton."

That quote really says it all.

[+] scythe|12 years ago|reply
While the mean deviation as presented is slightly nicer than sigma for intuitive purposes, it isn't as appropriate (iirc) for statistical tests on normal distributions and t-distributions.

More importantly, it doesn't fix the real problem, which is that the mean and standard deviation don't tell you everything you need to know about a data set, but often people like to pretend they do. It's not rare to read a paper in the soft sciences which might have been improved if the authors had reported the skewness, kurtosis, or similar data which could shed light on the phenomenon they're investigating. These latter statistics can reveal, for instance, a bimodal distribution, which could indicate a heterogeneous population of responders and non-responders to a drug, and that's just one example.

I'm not a statistician, so some of this might be a bit off.

[+] bluecalm|12 years ago|reply
So first about the article:

>>The notion of standard deviation has confused hordes of scientists

What an assertion! It has also proved very useful for hordes of scientists... how about some examples of confused scientists?

>>There is no scientific reason to use it in statistical investigations in the age of the computer

As someone who uses it daily I am eagerly awaiting his argument.

>>Say someone just asked you to measure the "average daily variations" for the temperature of your town (or for the stock price of a company, or the blood pressure of your uncle) over the past five days. The five changes are: (-23, 7, -3, 20, -1). How do you do it?

OK... if I am asked to calculate the average, I calculate the average; if I need to know the standard deviation, I calculate the standard deviation...

>> It corresponds to "real life" much better than the first—and to reality.

What the flying fuck. What "real life"? Standard deviation tells you how volatile the measurements are; mean deviation tells you something else. Both are very real-life things, just not the same thing.

>>It is all due to a historical accident: in 1893, the great Karl Pearson introduced the term "standard deviation" for what had been known as "root mean square error". The confusion started then: people thought it meant mean deviation.

I don't know how one can read it and not think: "is this guy high or just stupid?".

>> The confusion started then: people thought it meant mean deviation.

I have yet to see anybody who thinks that standard deviation is mean deviation. It's Taleb, though. Baseless assertions insulting groups of people are his craft.

>>What is worse, Goldstein and I found that a high number of data scientists (many with PhDs) also get confused in real life.

One example, please? I can give hundreds where std dev is useful and mean deviation isn't. Anything where you decide what % of your bankroll to bet on a perceived edge, for example.

OK, so he asserted that people should just use mean deviation instead of the mean of squares. Guess what though: taking squares has a purpose. It penalizes big deviations, so two situations that have the same mean deviation, but where one is more stable, have different standard deviations. This information is useful for many things: risk estimation, or calculating the sample size needed for a required confidence level (do you need more experiments, how careful should you be with conclusions and predictions, etc.). He didn't mention how we are going to achieve those with his proposal. Meanwhile he managed to throw insults at various groups without giving one single example of the misuse he describes.
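The point about equal mean deviations hiding different stability is easy to demonstrate (a sketch with made-up series):

```python
import math

def mad(xs):
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

def sd(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

# Two series with the same mean absolute deviation but very different
# stability: steady small deviations vs. mostly flat with two big jumps.
steady = [-1, 1, -1, 1, -1, 1, -1, 1]   # every point off by 1
spiky = [0, 0, 0, 0, 0, 0, -4, 4]       # two large outliers

print(mad(steady), mad(spiky))  # both 1.0
print(sd(steady), sd(spiky))    # 1.0 vs 2.0
```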

This is not the first time he has written something this way. His whole recent book is like that: anti-intellectual bullshit with many words and zero points. He doesn't give any arguments, he throws a lot of insults, he misuses words, and he makes up redundant terms which he then struggles to define. The guy is a vile idiot of the worst kind: ignorant and aggressive. That he gains so much following by spewing nonsense like this article is fascinating for sure, but there is no place for him in any serious debate.

[+] azakai|12 years ago|reply
>> The notion of standard deviation has confused hordes of scientists

> What an assertion! It has also proved very useful for hordes of scientists... how about some examples of confused scientists?

He is exaggerating, for sure. But the point is valid: the mean absolute deviation (MAD) is often very different from the standard deviation (STD), and the MAD is more intuitive, with a natural geometric interpretation; the STD's squaring of distances makes it more complex.

And yes, this confuses people in some cases, including scientists. Many scientists are not statistical experts, they use tools as they were taught, and they often assume MAD is approximately the STD, because it usually is, except in rare cases when it is not. I've seen examples of those people in grad school, he is not making this up.

The STD is much easier to analyze mathematically. That is the huge value it brings: squaring is an operation you can differentiate, while absolute value is not differentiable at zero. The STD gives us nice properties, like the easily proved fact that, for independent variables, the variance of a sum is the sum of the variances.
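That additivity (and MAD's lack of it) can be checked numerically (a sketch; the choice of one Gaussian and one uniform variable is arbitrary):

```python
import math
import random

random.seed(3)
N = 200_000

# Two independent variables and their sum.
a = [random.gauss(0, 1) for _ in range(N)]
b = [random.uniform(-2, 2) for _ in range(N)]
s = [x + y for x, y in zip(a, b)]

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def mad(xs):
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

print(var(a) + var(b), var(s))  # variances add: these two agree
print(mad(a) + mad(b), mad(s))  # MADs do not add: these two differ
```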

MAD, however, is nicer for reporting data, since it is more intuitive. I think he makes a valid point that the STD is used more frequently than it should be.

> Ok so he asserted that people should just use mean deviation instead of mean of squares. Guess what though, taking the squares have a purpose: it penalizes big deviations so two situations which have the same mean deviation but one is more stable have different standard deviations.

His point is that many people are not aware of that property and do not want it.

[+] tptacek|12 years ago|reply
This comment was great fun to read, and I suspect that I share some of your opinions about Taleb (although mine are decaying with each day that passes since I last finished reading any of his books), but it would have been stronger as an argument if you had left off the part about him being a "vile idiot".
[+] edw519|12 years ago|reply
Counterexample:

  |                           ..
  |                         .    .                 
  |    waste      |       .        .       |     stuff
  |     of        |      .          .      |       I
  |     my        |     .            .     |     don't
  |    time       |   .    stuff I'll  .   |     grok
  |               | .        read        . |       
  |(Nassim Taleb) |                        | .       
  |       .       |                        |       . 
  |______________________________________________________
          -3     -2     -1     0     1     2     3
                       standard deviations
[+] shas3|12 years ago|reply
You conveniently leave out the reasonable portions of his article. I read your comment before I read the article, and guess what, you have cherry picked specific parts of the article such that Taleb comes off as stupid.

It took me ten seconds to figure out what 'Goldstein and Taleb' refers to: http://www-stat.wharton.upenn.edu/~steele/Courses/434/434Con...

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=970480

He specifies that SD is reasonable as a mathematical tool, but not for inference about society, finance, etc.

Your comment appears to have an agenda or ideology underlying it.

Further you don't seem to have read the article fully.

>OK, so he asserted that people should just use mean deviation instead of the mean of squares. Guess what though: taking squares has a purpose. It penalizes big deviations, so two situations that have the same mean deviation, but where one is more stable, have different standard deviations. This information is useful for many things: risk estimation, or calculating the sample size needed for a required confidence level (do you need more experiments, how careful should you be with conclusions and predictions, etc.). He didn't mention how we are going to achieve those with his proposal. Meanwhile he managed to throw insults at various groups without giving one single example of the misuse he describes.

That is pretty much what his entire article is about: taking squares may not be the best idea universally.

[+] tel|12 years ago|reply
He's inflammatory. Really, really inflammatory. He also tends to write about obvious technical points so if you do pick through the morass and find the gem you'll be a bit underwhelmed. In this instance, standard deviations are obviously important statistics to compute and have many, many useful properties, sure, but Taleb rightly points out that they can be misused and I agree with him that their misuse is endemic in science and a downright travesty outside of it.

But to just say that some tool is both useful and misusable is boring and wouldn't cause people to talk about this nearly as much.

I don't really like Taleb's style, but I can't deny that it causes conversation about things that people often would not give a second thought. In that way, I can see the outlines of a really great point underlying his inflammatory rhetoric.

Don't hate the player? At least he's arguing that people ought to be smarter and worry about what data really means.

[+] ThomPete|12 years ago|reply
Context, I think:

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=970480

And I think you should pay more attention to this:

"...as it does more harm than good—particularly with the growing class of people in social science mechanistically applying statistical tools to scientific problems..."

if you want to understand where he is coming from.

[+] bnegreve|12 years ago|reply
> The guy is a vile idiot of the worst kind: ignorant and aggressive.

Your comment is a bit irritating. I don't know the author of this blog post, but I think he has a couple of valid points. You seem to dislike him for some other reason; feel free to share it. The blog post is still perfectly reasonable.

Anyway, I agree with some of your comments as well, namely this:

> I can give hundreds where std dev is useful and mean deviation isn't. Anything where you decide what % of your bankroll to bet on a perceived edge, for example.

and this:

> Guess what though: taking squares has a purpose. It penalizes big deviations, so two situations that have the same mean deviation, but where one is more stable, have different standard deviations.

But I don't see why this comment from N.T. is stupid:

>>>It is all due to a historical accident: in 1893, the great Karl Pearson introduced the term "standard deviation" for what had been known as "root mean square error". The confusion started then: people thought it meant mean deviation.

I actually agree that standard deviation is a confusing name, why do you think that's a stupid comment?

[+] lisper|12 years ago|reply
> One example, please? I can give hundreds where std dev is useful and mean deviation isn't.

It is ironic that you take Taleb to task for not providing an example, and then you do exactly the same thing that you are excoriating him for. Claiming that you have hundreds of examples is not the same as actually presenting one.

[+] Eliezer|12 years ago|reply
The true case against the standard deviation is more or less impossible to make without going into the math. For a mathematically advanced Bayesian, the idea is straightforward if you've already been through the lawful derivation of Gaussians from their underlying assumptions: Using standard deviations in your thinking makes sense if you think that an increasingly large error has a decreasing log-probability that goes as the square of the size of that error. The distribution that you get from the sum of independent variables is the classic example. Basically, if you think in standard deviations, then all postulates of the form "The error was that size" will come with probability penalties that you trade off against each other in proportion to the square of the size of that error. Thinking in standard deviations similarly corresponds to models where exceptional events become unlikely in proportion to the square of the size of that exception, only by "unlikely" you have to read log probability rather than probability, and log probability is (always) the right way for Bayesians to think about it anyway for reasons I won't go into.

And then the critique of "standard deviation" is that people got taught SD as a statistical tool that you just pull out any time you feel like it, and they don't know what it means in underlying probability-theoretic terms as an assumption about either the world or our own uncertainty, and so it's misused horribly on all sorts of occasions.

I'd guess that SDs are appropriate 40-90% of the time depending on which field you work in, but without a lot of Bayesian background with fairly advanced math people will not be able to really appreciate what the other 60-10% of times look like. And the state of education is not anywhere near like that. It's just people being taught to calculate the standard deviation cause, like, that's something you do. They don't know what assumptions it corresponds to even in the cases where SD does apply.

Burning SDs to the ground and starting over would not be very much amiss in one of the fields where SDs only make sense 40% of the time, but the practitioners are using them all the time. (Machine learning is one of the fields where SDs make sense maybe 40% of the time, and if I found an ML practitioner who had been taught to think in terms of squared error I'd send them off to learn the underlying probability theory until their entire worldview had been translated into likelihood functions instead.)
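The correspondence between squared error and Gaussian log-probability can be made concrete (a sketch of the standard derivation, not Eliezer's own notation): under a Gaussian noise model, the negative log-likelihood of an error is a constant plus the squared error divided by 2*sigma^2, so minimizing squared error is exactly maximizing Gaussian likelihood.

```python
import math

# Negative log-likelihood of an error e under N(0, sigma^2):
# a constant plus e^2 / (2 * sigma^2) -- the squared-error penalty
# is the Gaussian log-probability penalty.
def gaussian_nll(error, sigma=1.0):
    return 0.5 * math.log(2 * math.pi * sigma**2) + error**2 / (2 * sigma**2)

const = 0.5 * math.log(2 * math.pi)
for e in [0.0, 1.0, 2.0, 3.0]:
    print(e, gaussian_nll(e) - const)  # 0.0, 0.5, 2.0, 4.5 -> e^2 / 2
```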

[+] wglb|12 years ago|reply
I am wondering if you have a lot invested in concepts and techniques that Mr Taleb is suggesting are broken. There is certainly a tone of defensiveness about your post.

I think it is still unexplained why the financial community is sticking with things that are known to be broken. It is like trying to use Newtonian mechanics to describe chaotic phenomena: standard deviation does not properly describe fractal stuff.

It feels like the industry is searching for its lost keys on a dark street: not where they were last seen, but under the lamppost, because that is where the light is.

[+] drewblaisdell|12 years ago|reply
I agree with some of what you wrote, but I think you should reread your reply to Taleb with as much scrutiny as you read Taleb.
[+] canvia|12 years ago|reply
Take this with a grain of salt. Taleb has made a lot of enemies with his writings about the bank bailouts and government inefficiency. There are a lot of reasons for people to attack him, especially anonymously on the internet.

I am not saying that the arguments are without merit, just to be cautious in forming an opinion on the man.

[+] sliverstorm|12 years ago|reply
The funny thing is I've never used absolute deviation before, so I'm slightly confused right now as I wrap my head around what it means relative to sigma (which I have used, lightly, for years) and why anyone would want to use it instead...

edit: apparently MAD can refer to either "mean absolute deviation" or "median absolute deviation". Yup, this sure isn't going to confuse anybody.
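The two MADs really do disagree on the same data (a sketch with made-up numbers):

```python
import statistics

data = [1, 1, 2, 2, 4, 6, 9]

mean = statistics.mean(data)      # 25/7, about 3.571
median = statistics.median(data)  # 2

# "Mean absolute deviation": average distance from the mean.
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)

# "Median absolute deviation": median distance from the median.
median_abs_dev = statistics.median(abs(x - median) for x in data)

print(mean_abs_dev)    # about 2.37
print(median_abs_dev)  # 1
```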

[+] bravura|12 years ago|reply
>>>There is no scientific reason to use it in statistical investigations in the age of the computer

>> As someone who uses it daily I am eagerly awaiting his argument.

Compute-intensive statistical tests, which are data-driven and make fewer assumptions about the underlying distribution, can give tighter confidence intervals and detect statistical significance better.

For example, stratified shuffling.

[+] RivieraKid|12 years ago|reply
> Standard deviation tells you how volatile the measurements are; mean deviation tells you something else.

Why would you use sd to convey the amount of volatility? Isn't the mean deviation much more easily understood?

[+] washedup|12 years ago|reply
His point has always been to cut off the tails. The MAD actually allows you to cut off the tails sooner: moves outside your chosen confidence interval will register as tail risk much sooner than if you use STD. For example, consider the following comparison of normal returns versus S&P returns:

http://managed-futures-blog.attaincapital.com/wp-content/upl...

A distribution built from MAD will more closely resemble the reality of the S&P returns, and will create more robust models. There are many examples of this.

[+] msellout|12 years ago|reply
>> There is no scientific reason to use it in statistical investigations in the age of the computer

> As someone who uses it daily I am eagerly awaiting his argument.

The use of root-mean-square error/deviation comes from ease of calculation in the days before computers. Now that we have computers, we can just use mean absolute error/deviation. Before computers, we needed calculus to find the "best fit line": the parameters that minimize the distance between predicted and actual values. Calculus plays well with parabolas, but not so well with absolute value functions. Now that computers are around, we don't need calculus (as much).
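A sketch of the contrast (made-up data with one outlier; the brute-force grid search stands in for "just use the computer"): least squares has a closed form via calculus, while the least-absolute-deviations fit is found by sheer computation, and the outlier drags the least-squares line much further.

```python
# Made-up data: y roughly equals x, except the last point is an outlier.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0.1, 1.0, 2.2, 2.9, 4.1, 5.0, 6.1, 6.9, 8.0, 30.0]

# Closed-form least-squares fit (the calculus-era answer).
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope_ls = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
intercept_ls = my - slope_ls * mx

# Least-absolute-deviations fit by crude grid search: no derivatives,
# just compute power.
def total_abs_error(slope, intercept):
    return sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))

best = min(
    ((s / 100, b / 100) for s in range(0, 301) for b in range(-200, 201)),
    key=lambda p: total_abs_error(*p),
)

print("least squares:   slope %.2f, intercept %.2f" % (slope_ls, intercept_ls))
print("least abs error: slope %.2f, intercept %.2f" % best)
```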

[+] Glyptodon|12 years ago|reply
All I know is this reminds me a lot of high school, where we always had to compute std dev in problems, homework, and sometimes labs, but nobody ever really explained how to interpret it. It was always "This is std dev. This is how you compute it. Make sure you put it in your tables and report."

Eventually someone (or something) did explain it, but once I understood it, it became clear that it wasn't always a sensible thing to be asked to calculate; it was often just a reflexive requirement.

[+] spikels|12 years ago|reply
You gotta love the acronyms: STD versus MAD!

Taleb is definitely mad, but his use of the MAD acronym (mean absolute deviation) is actually correct. However, the STD acronym (all caps) usually refers to "sexually transmitted disease" and is not generally used for "standard deviation". Most people use SD, Stdev, StDev, or sigma.

Once again his ability to coin new terminology outstrips his ability to form coherent ideas that are anything more than trivial (e.g., we have known about fat tails in stock returns for 50+ years). Like George Soros's[1], Taleb's success says more about the state of the world of finance than about his contributions to our knowledge.

[1]-See his book "The Alchemy of Finance"

[+] Fomite|12 years ago|reply
STD is also an obsolete term. Epidemiology, medicine, etc. now most often refer to them as STIs, or sexually transmitted infections.

The reasoning for this is that many sexually transmitted infections can be acquired, passed on to others, etc. without causing any clinical symptoms. See: HPV, among others.

[+] zeidrich|12 years ago|reply
It's not that we should retire the notion of standard deviation. It's more that we should understand the tools that we are using and use the appropriate tool for the job.
[+] puranjay|12 years ago|reply
NNT is my intellectual superhero but the amount of hate he gets is tremendous.

Please understand that NNT's biggest issue is not so much with the way statistical models are applied to economics and finance, but with how social scientists sometimes feel compelled to apply them to social fields as well, which is plainly unscientific, dumb, and mostly disastrous.

So when you bear down on his arguments, please keep this context in mind.

[+] dxbydt|12 years ago|reply
The notion of area has confused hordes of scientists; it is time to retire it from common use and replace it with the more effective one of circumference. Area should be left to mathematicians, topologists and developers selling real estate. There is no scientific reason to use it in statistical investigations in the age of the computer, as it does more harm than good.

Say someone just asked you to measure the area of a circle with radius pi. The area is exactly 31. But how do you do it?

scala> math.round(math.Pi * math.Pi * math.Pi).toInt

res1: Int = 31

Do you pack the circle with n people, count them up and verify n == 31? Or do you pour a red liquid into the circle and fill it up, then drain it and measure the amount of red? For there are serious differences between the two methods.

If instead, you were asked to measure the circumference of a circle with radius pi.

scala> math.round(2 * math.Pi * math.Pi).toInt

res2: Int = 20

You just ask an able-bodied man, perhaps an unemployed migrant, to walk around this circle while another man, an upstanding Stanford sophomore, starts walking from Stanford to meet his maker, I mean VC, well it's the same thing...

So by the time the migrant finishes walking around the circle, our upstanding Stanford entrepreneur is greeting the VC on the tarmac of the San Francisco International Airport. This leads one to rightfully believe that the circumference of the circle of radius pi is exactly the distance from Stanford to the SF Airport, i.e. 20 miles. It corresponds to "real life" much better than the first—and to reality. In fact, whenever people make decisions after being supplied with the area, they act as if it were the distance from their university to the airport.

It is all due to a historical accident: in 250BC, the Greek mathematician Archimedes introduced Prop 2, the Prevention of Farm Cruelty Act ( http://en.wikipedia.org/wiki/California_Proposition_2_(2008) ). No I believe this was a different Prop 2. This Prop 2 states that the area of a circle is to the square on its diameter as 11 to 14 ( http://en.wikipedia.org/wiki/Measurement_of_a_Circle ). The confusion started then: people thought it meant areas had to do with being cruel to farm animals. But it is not just journalists who fall for the mistake: I recall seeing official documents from the department of data scientists, which found that a high number of data scientists (many with PhDs) also get confused in real life.

It all comes from bad terminology for something non-intuitive. Despite this confusion, Archimedes persisted in the folly by drawing circles in the sand, an infantile persuasion, surely. When the Romans waged war, Archimedes was still computing the area of the circle. The Roman soldier asked him to step outside, but Archimedes exclaimed "Do not disturb my circles!" (http://en.wikipedia.org/wiki/Noli_turbare_circulos_meos)

He was rightfully executed by the soldier for this grievous offense. It is sad that such a minor mathematician can lead to so much confusion: our scientific tools are way too far ahead of our casual intuitions, which starts to be a problem with a mad Greek. So I close with a statement by famed rapper Sir Joey Bada$$, extolling the virtues of the circumference: "So I keep my circumference of deep fried friends like dumplings, But fuck that nigga we munching, we hungry." (http://rapgenius.com/1931938/Joey-bada-hilary-swank/So-i-kee...)

[+] mekael|12 years ago|reply
Possibly the best thing I've read all week. Thank you sir.
[+] lambdasquirrel|12 years ago|reply
I think we'd be better off if we recognized that there are statistical distributions in the world besides the plain old Gaussian. For example, wealth does not follow a Gaussian, so why the heck do we throw around ideas like "above average wealth"?

Is MAD any better? Definitely. But I'd like to see a visual demonstration of how well it models exponential-based distributions. How well does it describe their "shape", the skew of the tail?
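One way to see what MAD does and doesn't capture: the MAD/SD ratio is a crude shape summary, and it differs by distribution (sqrt(2/pi) ≈ 0.798 for a Gaussian, 2/e ≈ 0.736 for an exponential), so neither number alone describes the skew of the tail. A minimal simulation sketch, not from the thread (stdlib Python, arbitrary seed and sample size):

```python
import math
import random

random.seed(0)
n = 200_000

def sd(xs):
    # population standard deviation: sqrt of mean squared deviation
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def mad(xs):
    # mean absolute deviation about the mean (Taleb's MAD)
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

normal = [random.gauss(0, 1) for _ in range(n)]
expo = [random.expovariate(1.0) for _ in range(n)]

# Theory: MAD/SD = sqrt(2/pi) ~ 0.798 for a Gaussian,
#         MAD/SD = 2/e       ~ 0.736 for Exponential(1).
print("Gaussian   MAD/SD:", round(mad(normal) / sd(normal), 3))
print("Exponential MAD/SD:", round(mad(expo) / sd(expo), 3))
```

The gap between the two ratios is exactly the sort of distribution-dependence Taleb leans on: quoting SD to someone who mentally converts it to a typical deviation misleads them by a factor that depends on the (usually unknown) shape.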

[+] cwyers|12 years ago|reply
"In fact, whenever people make decisions after being supplied with the standard deviation number, they act as if it were the expected mean deviation."

Boy, is that statement useless without any kind of context, example or citation.

[+] ChristianMarks|12 years ago|reply
Climate scientists--among others--have made similar recommendations to use the mean absolute error in place of the standard deviation, depending on the application. Taleb might have cited the extensive methodological literature--for example:

Cort J. Willmott, Kenji Matsuura, Scott M. Robeson. Ambiguities inherent in sums-of-squares-based error statistics. Atmospheric Environment 43 (2009) 749–752.

URL: http://climate.geog.udel.edu/~climate/publication_html/Pdf/W...

[+] bayesianhorse|12 years ago|reply
Nassim Taleb somehow likes to beat up on normals...

We Bayesians have similar notions, but we usually try not to overly bully frequentist methods, the poor things. Also, to anyone familiar with Bayesian methods, a lot of what Taleb is saying sounds vaguely familiar...

[+] randomsample2|12 years ago|reply
Standard deviation and mean absolute deviation are both useful, but I think it's silly to suggest that we all adopt exactly one measure of variability to summarize data sets. When in doubt, make a fucking histogram.
[+] thetwiceler|12 years ago|reply
It is sad that Taleb does not see the value in the standard deviation; standard deviation is far more natural, and more useful, than MAD.

For example, if X and Y are independent (or merely uncorrelated), and X has a standard deviation of s while Y has a standard deviation of t, then the standard deviation of X + Y is sqrt(s^2 + t^2). There is a geometry to statistics, and the standard deviation is its fundamental measure of length.

To retire the standard deviation is to ignore the wonderful geometry inherent in statistics. Covariance is one of the most important concepts in statistics, and it is a shame to hide it from those who use statistics.

Additionally, I will mention that we do not need normal distributions to make the standard deviation special. In fact, it is the geometry of probability - the fact that independent random variables have standard deviations which "point" in orthogonal directions - which causes the normal distribution to emerge as the limit in the central limit theorem.
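The Pythagorean additivity the parent describes is easy to check numerically. A minimal sketch, not from the thread (stdlib Python; the seed, sample size, and the choice of one Gaussian and one uniform variable are arbitrary):

```python
import math
import random

random.seed(1)
n = 200_000

def sd(xs):
    # population standard deviation
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

x = [random.gauss(0, 3.0) for _ in range(n)]     # SD = 3
y = [random.uniform(0, 12.0) for _ in range(n)]  # SD = 12/sqrt(12) ~ 3.464
z = [a + b for a, b in zip(x, y)]                # independent sum

# For independent X, Y: sd(X + Y) = sqrt(sd(X)^2 + sd(Y)^2),
# regardless of the two distributions' shapes.
print("sd(X+Y):        ", round(sd(z), 3))
print("sqrt(s^2 + t^2):", round(math.sqrt(sd(x) ** 2 + sd(y) ** 2), 3))
```

Note that MAD has no such rule: the mean absolute deviation of a sum of independent variables depends on their shapes, not just their individual MADs, which is part of why SD is the natural "length" in this geometry.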