Controlling for population (that is, using per capita rates) doesn't control for the impact of population density on the issue at hand. These are two different factors that are related and named very similarly.

Consider this simplified example.

11 areas. 10 of these are isolated and have 10 people living in each area. The last area has 100,000 people living in it. Of those 10 areas, 9 have 0 incidents, and 1 has 1 incident. In the 100,000 area, there are 100 incidents.

Not controlling for the population: The populated area has 100 incidents, the isolated areas have an average of 0.1 incidents each. A 1,000 times difference. The populated area looks much more dangerous.

Well, that's obviously the wrong way to look at the data, so let's account for population:

The isolated areas have a rate of 1 per 100 population, and the populated area has a rate of 1 per 1,000 population. A 10 times difference, in the opposite direction. So now we have established a link between being isolated and having more incidents, but we don't know why.
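The arithmetic behind both comparisons can be checked with a quick sketch. It uses only the figures from the example; the variable names are mine, and exact fractions are used just to avoid floating-point noise.

```python
from fractions import Fraction

# (population, incidents) for each area, taken from the example above.
isolated = [(10, 0)] * 9 + [(10, 1)]
populated = (100_000, 100)

isolated_pop = sum(pop for pop, _ in isolated)          # 100 people in total
isolated_incidents = sum(inc for _, inc in isolated)    # 1 incident in total

# Raw counts: average incidents per isolated area vs. the populated area.
avg_isolated = Fraction(isolated_incidents, len(isolated))
raw_ratio = populated[1] / avg_isolated

# Per capita rates: incidents divided by population.
isolated_rate = Fraction(isolated_incidents, isolated_pop)
populated_rate = Fraction(populated[1], populated[0])
rate_ratio = isolated_rate / populated_rate

print(f"raw counts: populated is {raw_ratio}x the isolated average")
print(f"per capita: isolated is {rate_ratio}x the populated rate")
```

Same data, opposite conclusions, depending only on whether you divide by population.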

We still haven't controlled for the impact of population density on incident rates. We need much more data to do that: with the given information the result would be "Isolated areas have 10 times the risk of incidents", and after controlling for that factor we would see no further trends in our data. If we added thousands more areas with different levels of population, calculated the per capita rate of incidents in each area, and then analyzed how population density relates to incident rates, we could control for both factors. The catch is that this last step is difficult without enough data, and researchers often can't isolate individual factors to control for because they correlate too strongly with other factors.
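A rough sketch of what that last analysis could look like, assuming we had land area for each place so that density can be computed. Every number below is invented purely for illustration (the example above gives no land areas), and the log-log least-squares fit is just one simple choice of model, not the only way to do this.

```python
import math

# HYPOTHETICAL data, made up for illustration only:
# (population, area in km^2, incidents) for each area.
areas = [
    (100, 50.0, 2),
    (1_000, 20.0, 10),
    (10_000, 40.0, 50),
    (100_000, 80.0, 200),
    (1_000_000, 100.0, 1_000),
]

# Density (people per km^2) and per capita incident rate, on log scales.
xs = [math.log10(pop / km2) for pop, km2, _ in areas]
ys = [math.log10(inc / pop) for pop, _, inc in areas]

# Ordinary least squares fit of log(rate) against log(density).
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

print(f"slope of log(rate) vs. log(density): {slope:.2f}")

# What is left over after the density trend is removed; any other factor
# would have to explain these residuals.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

With only a handful of areas, and with population and density rising together as they do in this made-up table, the fit can't tell the two factors apart, which is exactly the catch described above.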
