top | item 23753338

khr | 5 years ago

Agreed. I was curious enough to run the model myself so I used a tool to extract the data. The slope estimate (b=17.24) is not significantly different from zero, p=.437.

The data are here: https://pastebin.com/HhWTKZRb

bluenose69 | 5 years ago

In case anyone is interested, below is R code to read these data and compute the regression. The summary() reveals the p-value for the slope to be 0.437, and that for the intercept to be 0.32.

    # Fetch the data posted above and fit a simple linear regression
    d <- read.table("https://pastebin.com/raw/HhWTKZRb", header=TRUE)
    m <- lm(cumulative_covid19_per100000 ~ proportion_binge_drinkers, data=d)
    summary(m)  # slope p = 0.437, intercept p = 0.32

SubiculumCode | 5 years ago

The problem is that the author is essentially claiming that running the regression on data that don't pass his eyeball test is, in itself, a misuse of regression... which is nonsense.

gleenn | 5 years ago

I'm not sure I understand your point. Did you actually look at the regression line through the data? It looks way off. I'm not a statistician, but that line doesn't seem to represent the data very well at all. People are also making nuanced comments above, but the underlying fact seems to be that this is not a good use of linear regression, and there is no strong correlation between the two axes.

gowld | 5 years ago

Are you saying that eyeball tests are wrong? That's an extreme claim.

gowld | 5 years ago

What are some examples of data sets with high(ish) r but high p (low confidence), and with low r but low p (high confidence)?

I guess the first would be a very tall, "sharp-cornered" parallelogram of data points (clear slope on average, but high error variation), vs. a very short, wide rectangle for the second?

That would be a cool explorable demo.
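A rough sketch of that demo (not from the thread; the data here are made up purely for illustration): a handful of points near a steep line gives a high r but an unconvincing p, while thousands of points in a wide, barely tilted cloud gives a tiny r with a tiny p.

```python
# Demo: correlation strength (r) and statistical confidence (p) are
# separate questions. Small n can yield high r with high p; large n
# can make a negligible r highly "significant".
import numpy as np
from scipy import stats

# Case 1: only 4 points with a visible slope -> high r, weak p
x1 = np.array([0.0, 1.0, 2.0, 3.0])
y1 = np.array([1.0, 0.0, 2.0, 3.0])
r1, p1 = stats.pearsonr(x1, y1)  # r = 0.8, p = 0.2

# Case 2: 2000 points with a faint slope buried in large scatter
# (a deterministic sin() term stands in for noise) -> low r, tiny p
x2 = np.arange(2000, dtype=float)
y2 = 0.001 * x2 + 10 * np.sin(x2)
r2, p2 = stats.pearsonr(x2, y2)

print(f"few points, steep slope : r={r1:.2f}, p={p1:.2f}")
print(f"many points, faint slope: r={r2:.3f}, p={p2:.1e}")
```

An explorable version would just put sliders on n, the slope, and the scatter, and redraw r and p live.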