

ianhorn | 4 years ago

In my experience, this isn't a suppressed topic so much as a case of people talking past each other. It's like correlation versus causation (versus plain old connected definitions). A and B can be correlated even though A doesn't cause B (or neither causes the other) and their definitions have nothing to do with each other. Take nurse and gender: they're correlated in the US, but making someone a nurse doesn't change their gender, and the definitions have nothing to do with each other. Maybe in some countries the correlation is even flipped!

Recall all the times in stats where an estimator can be an unbiased estimator of a correlation while being a biased estimator of a causal effect.
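To make that concrete, here is a toy simulation sketch. Everything in it is an assumption for illustration and not something from a real dataset: a hidden confounder `z` drives both `a` and `b`, `a` has no causal effect on `b`, and the helper names `slope` and `simulate` are made up. The same least-squares slope recovers the association fine in the observational data, while the causal effect (what you get when `a` is set independently of `z`) is zero.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys on xs, i.e. cov(x, y) / var(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate(n=20000, seed=0):
    """Return (observational slope, interventional slope) for a toy model.

    Assumed model (hypothetical): z is a hidden confounder,
    a = z + noise, b = z + noise. Changing a never changes b.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]

    # Observational world: a and b both inherit variation from z.
    a_obs = [zi + rng.gauss(0, 1) for zi in z]
    b_obs = [zi + rng.gauss(0, 1) for zi in z]

    # Interventional world: a is assigned at random, ignoring z.
    a_do = [rng.gauss(0, 1) for _ in range(n)]
    b_do = [zi + rng.gauss(0, 1) for zi in z]

    return slope(a_obs, b_obs), slope(a_do, b_do)


observational, interventional = simulate()
print(f"slope of b on a, observational data: {observational:.3f}")
print(f"slope of b on a, a set at random:    {interventional:.3f}")
```

Under these assumptions the first number should land near cov(a, b) / var(a) = 1/2 and the second near 0. So the observational slope is a perfectly good estimate of the association and a badly biased estimate of the causal effect, both at once.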

So you get some people saying it (the correlation) is correct and other people saying it (the causal effect) is incorrect. Both are right! To stop talking past each other, they need to talk about bias with respect to the correlation or bias with respect to the causal effect in this particular direction.

But what frustrates me is when the correlation side uses the (true) correlation to argue that a system isn't biased with regard to something else (w.r.t. a definition, a causal effect, a literal translation, or some more complicated aspect of the system), and then argues that the resulting harms are okay because the bias is a "correct" bias.

We need to work on our terminology so that we can stop talking past each other. It doesn't help that our models have weird biases in absurdly complex function spaces, but we have to progress beyond a first-stats-course one-size-fits-all definition of bias.
