top | item 40765647

Beldin | 1 year ago

> Box plots [...] assume that your data follows a bell/gaussian shape.

Not sure how to square that with this statement on Wikipedia's page on box plots:

> Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions of the underlying statistical distribution[3]

sigmoid10|1 year ago

If you want to see why that is not fully correct you should read the article. For a box plot you need to calculate mean, variance and certain percentiles. These values don't make sense if your distribution does not follow a certain shape (because these values unambiguously define such a shape). See the examples in the article for what happens if you still try to use them in those cases. You can still extract the values of course (hence probably why wiki says they don't assume anything), but you lose significant information about the distribution. So you can no longer reverse the process.

Evidlo|1 year ago

> So you can no longer reverse the process

I've never understood this to be the purpose of a boxplot, only a means of visualizing a distribution's quartiles.

You've gotten a flood of comments from upset people, so I'll keep it short by saying that a boxplot doesn't actually do what you claim for Gaussians, as the 0 and 100 percentile "whiskers" would be at plus/minus infinity. As for a bounded bell-shaped distribution, there are several non-unique ways to define such a distribution.
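To make the point about the whiskers concrete, here is a rough sketch using Python's standard-library `statistics.NormalDist`. The probabilities chosen are arbitrary, picked only for illustration. It shows the quantile of a standard Gaussian running off toward minus infinity as the percentile approaches 0, so a true 0th-percentile whisker has no finite position.

```python
from statistics import NormalDist

# Standard Gaussian: mean 0, standard deviation 1.
gaussian = NormalDist(mu=0.0, sigma=1.0)

# Illustrative tail probabilities approaching the 0th percentile.
tail_probabilities = [1e-3, 1e-6, 1e-9]

# inv_cdf(p) is the quantile function: the x with P(X <= x) = p.
tail_quantiles = [gaussian.inv_cdf(p) for p in tail_probabilities]

for p, x in zip(tail_probabilities, tail_quantiles):
    print(f"p = {p:g}: quantile = {x:.3f}")
```

Each smaller probability gives a more negative quantile, with no lower bound. By symmetry the same holds at the upper end as the percentile approaches 100.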

JumpCrisscross|1 year ago

> For a box plot you need to calculate mean, variance

Quantiles and medians. (Plus min and max.) Non-parametric.

These335|1 year ago

Mean and variance have nothing to do with boxplots, you are mistaken.

gradstudent|1 year ago

> because these values unambiguously define such a shape

I think this is a misunderstanding, and I think it is shared by the author of the article. Boxplots show ranges. That's it.

rcxdude|1 year ago

The mean and variance are not features of a box plot. Box plots show the quartiles, which are about the cumulative distribution.
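For anyone who wants to check this, below is a minimal sketch of the five numbers a basic box plot draws, using only Python's standard library. The sample data and the `five_number_summary` helper are made up for illustration, and the "inclusive" quantile method is just one of several conventions (plotting libraries differ on this, and many draw whiskers at 1.5 times the interquartile range rather than at min and max). Note that no mean or variance is computed anywhere.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a sequence of numbers.

    Hypothetical helper for illustration: everything here comes from
    ordering the data; no mean or variance is involved.
    """
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))

# Made-up, strongly right-skewed sample (clearly not bell-shaped).
skewed = [1, 1, 2, 2, 3, 4, 6, 9, 15, 40]

summary = five_number_summary(skewed)
print(summary)
```

The summary is well defined even though the sample is heavily skewed, which is the sense in which the plot is non-parametric.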