top | item 16841540

smu3l | 7 years ago

Oops I did not see your response until now.

I agree, changing the model changes the estimates, because the parameters you are estimating change.

However, given one misspecified model, the parameters of that model are still well defined, though they may not have the interpretation they would have if the model were correctly specified. As OP called it, this is the "best fit line": a projection of the truth onto your model. E.g. for a simple linear regression of Y on X, where the true conditional mean of Y given X is not linear, there is still some "true" best line. This line also depends on the distribution of X, though it would not if the model were correct. Estimates from linear regression will converge to the parameters of this line, but the usual standard errors will be wrong.

There's a very general theorem or corollary that covers this in Asymptotic Statistics by van der Vaart. I think it's in the chapter on M-estimators, right around where MLEs are covered, but I don't have it in front of me.
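A small simulation illustrates the point about the "best line" depending on the distribution of X. The quadratic truth and the two uniform distributions below are arbitrary choices for illustration, not anything from the thread: when E[Y|X] = X + X^2, the OLS slope converges to 1 for X symmetric around 0 (where Cov(X, X^2) = 0) but to 3 for X uniform on (0, 2), because the projection of the quadratic term onto X changes with X's distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_slope(x, y):
    # Sample OLS slope: cov(x, y) / var(x)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

n = 200_000

def draw_y(x):
    # True conditional mean is nonlinear: E[Y|X] = X + X^2
    return x + x**2 + rng.normal(size=x.shape)

# Same (misspecified) linear model, two different distributions for X
x_sym = rng.uniform(-1, 1, n)      # symmetric around 0
x_shift = rng.uniform(0, 2, n)     # shifted to the right

b_sym = ols_slope(x_sym, draw_y(x_sym))
b_shift = ols_slope(x_shift, draw_y(x_shift))

# Population values: Cov(X, X^2)/Var(X) is 0 for the symmetric X
# (slope -> 1) and 2 for X ~ U(0, 2) (slope -> 1 + 2 = 3)
print(b_sym)    # near 1
print(b_shift)  # near 3
```

Both estimates are consistent for the parameters of *some* best line; there is just no single line independent of the design distribution, which is what a correctly specified model would buy you.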


nonbel | 7 years ago

There are multiple inference levels here.

First, there is the statistical level, at which we are drawing some conclusion about the model parameter. This may work even for a misspecified model.

Then there is the level at which you want to draw some conclusion about reality, call it the "scientific level". If the model is misspecified, the parameters/coefficients may or may not correspond to the thing of interest. Perhaps the model is a close enough approximation for those values to be meaningful, perhaps not...

I think it is the second level of inference (the "scientific" one) that most people are concerned about. The rigor of the proofs/theorems that may hold at the statistical level does not extend to the scientific level.

Afaict, the majority of erroneous inference occurs at the scientific level, and statistical error/uncertainty is only a lower bound on the total error/uncertainty.