top | item 21196160


holy_city | 6 years ago

A bigger issue I've seen is that "gut feeling" comes from a misplaced sense of confidence. Take sample size, for example: I can't tell you how many times I've heard engineers say "the data isn't significant because the sample size is too small." If you have the data, calculate the confidence interval!

Most of the time you don't need hundreds or thousands of data points to be reasonably confident; a few dozen is often enough. I remember that example distinctly from my sophomore engineering stats course; I don't know why everyone else has forgotten it.
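To make "just calculate the interval" concrete, here is a minimal sketch of a 95% confidence interval for a mean with only 25 data points. The data and the hardcoded t critical value (2.064, the two-sided value for df = 24) are illustrative assumptions, not anything from the thread:

```python
import math
import statistics

def mean_ci_95(samples, t_crit=2.064):
    """95% CI for the mean: x_bar +/- t * s / sqrt(n).

    t_crit = 2.064 is Student's t for df = 24 (i.e. n = 25).
    For other sample sizes, look up the matching critical value
    (e.g. scipy.stats.t.ppf(0.975, n - 1)).
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    return mean - t_crit * sem, mean + t_crit * sem

# A couple dozen measurements already give a usable interval.
data = [9.8, 10.1, 10.4, 9.7, 10.0, 10.2, 9.9, 10.3, 10.1, 9.6,
        10.0, 10.2, 9.8, 10.1, 9.9, 10.3, 10.0, 9.7, 10.2, 10.1,
        9.9, 10.0, 10.4, 9.8, 10.1]
low, high = mean_ci_95(data)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

With 25 well-behaved points the interval here is already narrow; whether that's "significant" depends on the question, but the interval tells you directly how much the small n actually costs you.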



m12k | 6 years ago

I think most people overlook (or don't know) that the needed sample size depends not just on the confidence you want, but also on how big the effect you want to measure is. E.g. if landing page A has a conversion rate of 50% and B has one of 55%, that's going to take a lot of sampling to prove. But if A has 40% and B has 80%, that's going to show up in the samples very quickly. Exactly which question you ask also affects the needed sample size greatly - e.g. showing that 'B performs better than A' will take fewer samples than showing that 'B performs at least 20% better than A'. This makes it much harder to have a correct intuition about needed sample sizes.
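The effect-size point can be sketched with the standard normal-approximation formula for comparing two proportions (alpha = 0.05 two-sided, 80% power); the z values are the usual 1.96 and 0.8416, and the landing-page numbers are the ones from the comment above:

```python
import math

def n_per_group(p1, p2, z_alpha=1.96, z_beta=0.8416):
    """Approximate per-group sample size to distinguish conversion
    rates p1 and p2 (normal approximation, 5% significance, 80% power)."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    delta = p2 - p1
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)

# Small effect: 50% vs 55% conversion needs a lot of traffic.
print(n_per_group(0.50, 0.55))  # -> 1562 per variant
# Large effect: 40% vs 80% shows up almost immediately.
print(n_per_group(0.40, 0.80))  # -> 20 per variant
```

Roughly a 75x difference in required traffic for the two scenarios, which is why a single intuition about "enough samples" can't cover both.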

opportune | 6 years ago

In my experience, when the data set is very small it is almost always also biased towards how easy it was to gather, which makes it non-representative. Think about it: if it were as easy to let n = 5000 as to let n = 25, you would always pick 5000. You only pick n = 25 because of the low effort involved, which often means proximity.

A very common example is when some software feature is A/B tested only internally, or even only on the team that developed it. That introduces a lot of bias in users' technical competence, willingness to understand (or prior understanding of) the new behavior, how their environments are set up, etc.
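A tiny simulation makes the convenience-sampling bias visible. The population mix, the success rates, and the idea that the first 25 reachable people are all "power users" are hypothetical assumptions, just to illustrate the mechanism:

```python
import random

random.seed(42)

# Hypothetical population: 10% "power users" who succeed with a new
# feature ~90% of the time, 90% typical users who succeed ~40% of the
# time. True overall success rate is about 0.45.
population = (
    [1 if random.random() < 0.9 else 0 for _ in range(1_000)]    # power users
    + [1 if random.random() < 0.4 else 0 for _ in range(9_000)]  # typical users
)

# Convenience sample: the first 25 people you can reach -- here, all
# power users (think: the team that built the feature).
convenience = population[:25]

# Representative sample of the same size, drawn from everyone.
representative = random.sample(population, 25)

print(sum(convenience) / 25)      # near 0.9 -- wildly optimistic
print(sum(representative) / 25)   # near the true rate of ~0.45
```

Both samples have n = 25, so the problem the small convenience sample has isn't the variance a confidence interval would capture; it's that the interval is centered on the wrong population.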