[R] Highly significant intercept and large standard error

Chris Mcowen sam_smith at me.com
Wed Oct 6 16:38:04 CEST 2010


Hi Wolfgang, 

Thanks for this, it makes sense. 

I should have been more detailed when I described my model; it is in fact binomial (sell or not). 
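Since the model is binomial, the estimates in the coefficient table of the original post are on the log-odds scale. As a rough sketch (assuming a logit link, and ignoring any random effects or other predictors in the lmer fit), they can be converted to approximate per-level probabilities. The function name inv_logit is introduced here for illustration only.

```python
import math

def inv_logit(x):
    """Map a log-odds value to a probability (assumes a logit link)."""
    return 1.0 / (1.0 + math.exp(-x))

# Estimates copied from the coefficient table in the original post.
intercept = -1.35778   # reference level of Mag (Mid)
mag_new = -15.76939    # difference New vs. Mid, on the log-odds scale
mag_old = 0.14250      # difference Old vs. Mid, on the log-odds scale

p_mid = inv_logit(intercept)
p_new = inv_logit(intercept + mag_new)
p_old = inv_logit(intercept + mag_old)

print(f"P(sell | Mid) ~ {p_mid:.3f}")
print(f"P(sell | Old) ~ {p_old:.3f}")
print(f"P(sell | New) ~ {p_new:.2e}")
```

On this reading, the significant intercept only says that the log-odds for the reference level differ from zero, i.e. that its probability differs from 0.5. The MagNew estimate corresponds to a fitted probability that is practically zero.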

> remove the Mag factor from the model, you get a model with just an intercept, reflecting the overall mean

This is true, but what I was trying to say (not very well!) was that I have other factors, such as price (High, Mid, Low) and condition (Best, Average, Poor), and all models that include Mag have a much better AIC than models without it. I was unsure whether this is an artefact of the high SE for MagNew rather than Mag being a key factor.

> Maybe the data have been entered incorrectly

I have checked this and all is fine. They are categorical variables, not continuous, so Mag is either New, Old or Mid. 
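One further check, not mentioned in the thread and offered only as a possibility, is to tabulate the binary outcome within each level of Mag. If a level contains only one kind of outcome (for example, no sales at all), a binomial model has no finite estimate for that level, which can show up as a very large coefficient with a huge standard error. The records below are made up for illustration and are not the data from this thread.

```python
from collections import Counter

# Hypothetical (level, sold) records; 1 = sold, 0 = not sold.
records = [
    ("Old", 1), ("Old", 0), ("Old", 0),
    ("Mid", 1), ("Mid", 0),
    ("New", 0), ("New", 0), ("New", 0),
]

totals = Counter(level for level, _ in records)
sold = Counter(level for level, s in records if s == 1)

# Flag levels in which every observation has the same outcome.
degenerate = sorted(
    level for level in totals
    if sold[level] == 0 or sold[level] == totals[level]
)

for level in sorted(totals):
    print(level, "sold:", sold[level], "of", totals[level])
print("levels with a single outcome:", degenerate)
```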

Sam



On 6 Oct 2010, at 15:05, Viechtbauer Wolfgang (STAT) wrote:

I do not know about the details of the model, but the results are not all that strange. I'll assume that you are using family=gaussian(), so you are essentially running a model where (Intercept) reflects the mean of the dependent variable for that third category (MagMid) of the Mag factor and MagNew and MagOld are the mean differences between MagMid and those two other categories.
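A minimal sketch of that interpretation, using made-up numbers and assuming a Gaussian model with Mag as the only predictor and default treatment coding: the intercept equals the mean of the reference level, and each remaining coefficient equals the difference between that level's mean and the reference mean.

```python
from statistics import mean

# Made-up responses per level, for illustration only.
y = {
    "Mid": [0.1, -0.2, 0.3, 0.0],
    "New": [-4.0, -6.0],
    "Old": [0.2, 0.1, -0.1, 0.4],
}

reference = "Mid"  # the level absorbed into the intercept
level_means = {level: mean(vals) for level, vals in y.items()}

# With a single factor, least squares reproduces the level means exactly,
# so the coefficients can be written down directly.
intercept = level_means[reference]
coefs = {
    f"Mag{level}": m - intercept
    for level, m in level_means.items()
    if level != reference
}

print("(Intercept)", round(intercept, 3))
for name, value in coefs.items():
    print(name, round(value, 3))
```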

If you remove the Mag factor from the model, you get a model with just an intercept, reflecting the overall mean. Two things will happen. That overall mean is essentially a weighted average of the three level-specific means. MagMid and MagOld are the most frequent categories and both these means are close to zero, so the overall mean will be pulled close to zero. Moreover, the amount of variability around the overall mean will be larger than the amount of variability around the level-specific means. This will lead to a larger standard error for the overall mean. Hence, it could very well happen that the intercept is no longer significant when you remove that factor.
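Both points can be checked on toy data. The sketch below uses made-up numbers (not the data from this thread) and again assumes a Gaussian response: the overall mean equals the size-weighted average of the level means, and the spread around the overall mean is larger than the pooled spread around the level-specific means.

```python
from math import sqrt
from statistics import mean, stdev

# Made-up responses per level, for illustration only.
y = {
    "Mid": [0.1, -0.2, 0.3, 0.0],
    "New": [-4.0, -6.0],
    "Old": [0.2, 0.1, -0.1, 0.4],
}

all_vals = [v for vals in y.values() for v in vals]
n = len(all_vals)
k = len(y)

# Overall mean versus the weighted average of the level means.
overall = mean(all_vals)
weighted = sum(len(vals) * mean(vals) for vals in y.values()) / n

# Spread around the overall mean (intercept-only model).
sd_total = stdev(all_vals)

# Pooled spread around the level means (model that includes the factor).
ss_within = sum((v - mean(vals)) ** 2 for vals in y.values() for v in vals)
sd_within = sqrt(ss_within / (n - k))

print("overall mean:", round(overall, 3), "weighted:", round(weighted, 3))
print("SD around overall mean:", round(sd_total, 3))
print("pooled SD around level means:", round(sd_within, 3))
print("SE of overall mean:", round(sd_total / sqrt(n), 3))
```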

Given that MagNew only occurred a few times and given its very different mean and huge standard error, I suspect that some value(s) within that level are "screwy". Maybe the data have been entered incorrectly. One thing I have seen happen a few times is that missing data were coded, for example, as a -9999 in the dataset created with, for example, SPSS, but were then accidentally treated as observed values when analyzed with some other software, such as R. That could cause such a low mean for that category and the huge SE.
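To illustrate how much damage a single unrecognised missing-value code can do, here is a small sketch with invented values. The constant MISSING_CODE is a hypothetical sentinel chosen to match the -9999 example; a real dataset may use a different code or none at all.

```python
from statistics import mean

MISSING_CODE = -9999  # hypothetical sentinel for a missing value

# Invented values for one level, with one missing value left in by mistake.
values = [0.2, -0.1, MISSING_CODE, 0.3, 0.0]

# Treating the sentinel as an observed value drags the mean far down.
naive_mean = mean(values)

# Dropping the sentinel first gives a sensible mean.
clean = [v for v in values if v != MISSING_CODE]
clean_mean = mean(clean)
n_flagged = len(values) - len(clean)

print("mean with sentinel included:", naive_mean)
print("mean with sentinel removed:", clean_mean)
print("values flagged as missing:", n_flagged)
```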

It's just a hunch. Could be anything, but I would certainly take another good look at the values within that level.

Best,

--
Wolfgang Viechtbauer                        http://www.wvbauer.com/
Department of Methodology and Statistics    Tel: +31 (0)43 388-2277
School for Public Health and Primary Care   Office Location:
Maastricht University, P.O. Box 616         Room B2.01 (second floor)
6200 MD Maastricht, The Netherlands         Debyeplein 1 (Randwyck)


----Original Message----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sam
Sent: Wednesday, October 06, 2010 14:03
To: r-help at r-project.org
Subject: [R] Highly significant intercept and large standard error

> Dear list,
> 
> I am running an lmer model and have a question.
> 
> Whenever I put a factor (Mag) in my model it lowers the AIC of the
> model; however, the intercept is the only value with a significant
> p-value. I have looked at the coefficients and the standard errors and
> something jumps out at me.
> 
> 
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -1.35778    0.30917  -4.392 1.12e-05 ***
> MagNew      -15.76939 1255.06372  -0.013    0.990
> MagOld        0.14250    0.25246   0.564    0.572
> 
> MagNew relates to a categorical factor (Mag) that has 3 levels, of
> which New is one and Old is another (the third is not displayed).
> 
> It appears MagNew has a huge Std. Error; what could cause this?
> 
> When I do str(Mag) you will see that New is relatively rare (29 out
> of 871). I presume it is this that is raising the Std. Error value.
> However, I am not sure why this is causing the intercept to have a
> highly significant p-value. Furthermore, how do I interpret it? I am
> using AIC values as my basis for model selection and I am unsure
> whether this really is the most likely model or not.
> 
> Thanks
> 
> Sam
> 
> [1] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [12] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [23] Old  Old  Old  Mid     Old  Old  Old  Mid     Old  Old  Old
> [34] Old  Old  Old  Old  Mid     Old  Old  Old  Old  Old  Old
> [45] Mid     Mid     Mid     Old  Old  Old  Mid     Mid     Mid
> Mid     Old [56] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [67] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [78] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [89] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [100] Old  Old  Old  Old  Old  Old  Old  Old  Old  New New
> [111] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Mid
> [122] Mid     Mid     Mid     Mid     Old  Old  Old  Old  Mid     Mid
> Mid [133] Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid
> Mid     Mid     Mid [144] Mid     Mid     Mid     Mid     Old  Old
> Old  Mid     Mid     Mid     Mid [155] Mid     Mid     Mid     Mid
> Mid     Mid     Mid     Old  Old  Old  Old [166] Old  Old  Old  Mid
> Mid     Mid     Mid     Mid     Mid     Mid     Mid [177] Mid     Mid
> Mid     Mid     Mid     Mid     Mid     Mid     Mid     Old  Mid
> [188] Mid     Mid     Mid     Mid     Old  Mid     Mid     Mid
> Mid     Mid     Mid [199] Mid     Mid     Old  Old  Old  Old  Old
> Old  Old  Old  Old [210] Old  Old  Old  Old  Old  Old  Old  Old  Old
> Old  Old [221] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [232] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [243] Old
> Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [254] Old  Old  Old
> Old  Old  Old  Old  Old  Old  Old  Old [265] Old  Old  Old  Old  Old
> Old  Old  Old  Old  Old  Old [276] Old  Old  Old  Old  Old  Old  Old
> Old  Old  Old  Old [287] Old  Old  Old  Old  Old  Old  Old  Old  Old
> Old  Old [298] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [309] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [320] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [331] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Mid
> [342] Old  Old  Old  Old  Old  Old  Old  New New New New
> [353] New New New New New Old  Old  Old  Old  Old  Old
> [364] Old  New Old  Old  Old  Old  Old  Old  Old  Old  Old
> [375] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [386] Old  Old  Old  Old  Old  Old  Old  Old  Mid     Mid     Mid
> [397] Mid     Mid     Mid     Old  Old  Mid     Old  Old  Mid     Mid
> Mid [408] Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid
> Mid     Mid     Mid [419] Old  Old  Old  Old  Mid     Mid     Mid
> Mid     Mid     Old  Mid [430] Mid     Mid     Mid     Mid     Mid
> Mid     Mid     Mid     Mid     Mid     Mid [441] Mid     Mid     Mid
> Mid     Mid     Mid     Old  Old  Old  Old  Old [452] Old  Old  Old
> Old  Old  Old  Old  Mid     Mid     Old  Old [463] Mid     Mid
> Old  Old  Mid     Mid     Mid     Mid     Mid     Old  Mid [474] Mid
> Mid     Old  Mid     Old  Old  Old  Old  Old  Old  Old [485] Mid
> Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid
> Old [496] Old  Old  Old  Old  Old  Old  Mid     Old  Mid     Old  Old
> [507] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [518] Mid
> Mid     Mid     Mid     Old  Mid     Old  Mid     Old  Mid     Mid
> [529] Old  Old  Mid     Mid     Mid     Mid     Mid     Mid     Old
> Mid     Mid [540] Mid     Mid     Mid     Mid     Mid     Mid     Old
> Old  Old  Old  Mid [551] Mid     Mid     Old  Old  Mid     Mid
> Old  Mid     Old  Old  Old [562] Old  Mid     Old  Old  Old  Mid
> Old  Old  Old  Old  Mid [573] Mid     Mid     Old  Old  Mid     Mid
> Mid     Mid     Old  Old  Old [584] Mid     Old  Old  Old  Old  Old
> Old  Mid     Mid     Mid     Old [595] Mid     Mid     Mid     Old
> Old  New Mid     Mid     Old  Mid     Mid [606] Mid     Old  Mid
> Old  Old  Mid     Mid     Mid     Mid     Mid     Old [617] Mid
> Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [628] Old  Old  Mid
> Old  Old  Old  Old  Old  Old  Old  Old [639] Old  Old  Old  Old  Old
> Old  Old  Old  Old  Old  New [650] Old  Mid     Old  Old  Old  Old
> Old  Old  Old  Old  Old [661] Old  Old  Old  Old  Old  Old  Old  Old
> Old  Old  Old [672] Old  Old  New Old  Old  Old  Old  Old  Old  Old
> Old [683] New Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [694]
> Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [705] Old  Old
> Old  New Old  Old  New Old  Old  Old  Old [716] New New New New New
> Old  Old  Old  New Old  Old [727] Old  Old  Old  Old  Old  Old  Mid
> Old  Old  Old  New [738] Old  Old  Old  Old  Old  Old  Old  Old  Old
> Old  Old [749] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [760] New Old  Old  Old  Old  Old  Old  Old  Old  Old  New
> [771] Old  Old  Old  Old  Old  Old  Mid     Old  Old  New Old
> [782] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [793] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [804] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [815] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [826] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [837] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [848] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> [859] Old  Old  Old  Old  Old  Old  Mid     Mid     Old  Old  Old
> [870] Old  Old
> Levels: Mid New Old
> 


