[R] Highly significant intercept and large standard error

Daniel Nordlund djnordlund at frontier.com
Wed Oct 6 17:16:59 CEST 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Chris Mcowen
> Sent: Wednesday, October 06, 2010 7:38 AM
> To: Viechtbauer Wolfgang (STAT)
> Cc: r-help at r-project.org
> Subject: Re: [R] Highly significant intercept and large standard error
> 
> Hi Wolfgang,
> 
> Thanks for this, it makes sense.
> 
> I should have been more detailed when I described my model; it is in fact
> binomial - sell or not.
> 
> > remove the Mag factor from the model, you get a model with just an
> intercept, reflecting the overall mean
> 
> This is true, but what I was trying to say (not very well!) was that I have
> other factors such as price (High, Mid, Low), condition (Best, Average, Poor),
> etc., and all models that include Mag have a much better AIC than
> models without Mag. I was unsure whether this was an artefact of the high SE
> for MagNew rather than Mag being a key factor.
> 
> > Maybe the data have been entered incorrectly
> 
> I have checked this and all is fine; they are categorical variables, not
> continuous, so Mag is either New, Old or Mid.
> 
> Sam
> 
> 
> 
> On 6 Oct 2010, at 15:05, Viechtbauer Wolfgang (STAT) wrote:
> 
> I do not know about the details of the model, but the results are not all
> that strange. I'll assume that you are using family=gaussian(), so you are
> essentially running a model where (Intercept) reflects the mean of the
> dependent variable for that third category (MagMid) of the Mag factor and
> MagNew and MagOld are the mean differences between MagMid and those two
> other categories.
> 
> If you remove the Mag factor from the model, you get a model with just an
> intercept, reflecting the overall mean. Two things will happen. That
> overall mean is essentially a weighted average of the three level-specific
> means. MagMid and MagOld are the most frequent categories and both these
> means are close to zero, so the overall mean will be pulled close to zero.
> Moreover, the amount of variability around the overall mean will be larger
> than the amount of variability around the level-specific means. This will
> lead to a larger standard error for the overall mean. Hence, it could very
> well happen that the intercept is no longer significant when you remove
> that factor.
> 
> Given that MagNew only occurred a few times and given its very different
> mean and huge standard error, I suspect that some value(s) within that
> level are "screwy". Maybe the data have been entered incorrectly. One
> thing I have seen happen a few times is that missing data were coded, for
> example, as a -9999 in the dataset created with, for example, SPSS, but
> were then accidentally treated as observed values when analyzed with some
> other software, such as R. That could cause such a low mean for that
> category and the huge SE.
> 
> It's just a hunch. Could be anything, but I would certainly take another
> good look at the values within that level.
> 
> Best,
> 
> --
> Wolfgang Viechtbauer                        http://www.wvbauer.com/
> Department of Methodology and Statistics    Tel: +31 (0)43 388-2277
> School for Public Health and Primary Care   Office Location:
> Maastricht University, P.O. Box 616         Room B2.01 (second floor)
> 6200 MD Maastricht, The Netherlands         Debyeplein 1 (Randwyck)
> 
> 
> ----Original Message----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Sam
> Sent: Wednesday, October 06, 2010 14:03
> To: r-help at r-project.org
> Subject: [R] Highly significant intercept and large standard error
> 
> > Dear list,
> >
> > I am running an lmer model and have a question.
> >
> > Whenever I put a factor (Mag) in my model it lowers the AIC of the
> > model; however, the intercept is the only term with a significant
> > p-value. I have looked at the coefficients and the standard errors and
> > something jumps out at me.
> >
> >
> >              Estimate Std. Error z value Pr(>|z|)
> > (Intercept)  -1.35778    0.30917  -4.392 1.12e-05 ***
> > MagNew      -15.76939 1255.06372  -0.013    0.990
> > MagOld        0.14250    0.25246   0.564    0.572
> >
> > MagNew relates to a categorical factor (Mag) that has 3 levels, of
> > which New is one and Old is another (the third is not displayed).
> >
> > It appears MagNew has a huge Std. Error; what could cause this?
> >
> > When I do str(Mag) you will see that New is relatively rare (29 out
> > of 871); I presume it is this that is raising the Std. Error value.
> > However, I am not sure why this is causing the intercept to have a
> > highly significant p-value. Furthermore, how do I interpret it? I am
> > using AIC values as my basis of model selection and I am unsure whether
> > this really is the most likely model or not.
> >
> > Thanks
> >
> > Sam
> >
> > [1] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [12] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [23] Old  Old  Old  Mid     Old  Old  Old  Mid     Old  Old  Old
> > [34] Old  Old  Old  Old  Mid     Old  Old  Old  Old  Old  Old
> > [45] Mid     Mid     Mid     Old  Old  Old  Mid     Mid     Mid
> > Mid     Old [56] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [67] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [78] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [89] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [100] Old  Old  Old  Old  Old  Old  Old  Old  Old  New New
> > [111] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Mid
> > [122] Mid     Mid     Mid     Mid     Old  Old  Old  Old  Mid     Mid
> > Mid [133] Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid
> > Mid     Mid     Mid [144] Mid     Mid     Mid     Mid     Old  Old
> > Old  Mid     Mid     Mid     Mid [155] Mid     Mid     Mid     Mid
> > Mid     Mid     Mid     Old  Old  Old  Old [166] Old  Old  Old  Mid
> > Mid     Mid     Mid     Mid     Mid     Mid     Mid [177] Mid     Mid
> > Mid     Mid     Mid     Mid     Mid     Mid     Mid     Old  Mid
> > [188] Mid     Mid     Mid     Mid     Old  Mid     Mid     Mid
> > Mid     Mid     Mid [199] Mid     Mid     Old  Old  Old  Old  Old
> > Old  Old  Old  Old [210] Old  Old  Old  Old  Old  Old  Old  Old  Old
> > Old  Old [221] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [232] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [243] Old
> > Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [254] Old  Old  Old
> > Old  Old  Old  Old  Old  Old  Old  Old [265] Old  Old  Old  Old  Old
> > Old  Old  Old  Old  Old  Old [276] Old  Old  Old  Old  Old  Old  Old
> > Old  Old  Old  Old [287] Old  Old  Old  Old  Old  Old  Old  Old  Old
> > Old  Old [298] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [309] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [320] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [331] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Mid
> > [342] Old  Old  Old  Old  Old  Old  Old  New New New New
> > [353] New New New New New Old  Old  Old  Old  Old  Old
> > [364] Old  New Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [375] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [386] Old  Old  Old  Old  Old  Old  Old  Old  Mid     Mid     Mid
> > [397] Mid     Mid     Mid     Old  Old  Mid     Old  Old  Mid     Mid
> > Mid [408] Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid
> > Mid     Mid     Mid [419] Old  Old  Old  Old  Mid     Mid     Mid
> > Mid     Mid     Old  Mid [430] Mid     Mid     Mid     Mid     Mid
> > Mid     Mid     Mid     Mid     Mid     Mid [441] Mid     Mid     Mid
> > Mid     Mid     Mid     Old  Old  Old  Old  Old [452] Old  Old  Old
> > Old  Old  Old  Old  Mid     Mid     Old  Old [463] Mid     Mid
> > Old  Old  Mid     Mid     Mid     Mid     Mid     Old  Mid [474] Mid
> > Mid     Old  Mid     Old  Old  Old  Old  Old  Old  Old [485] Mid
> > Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid     Mid
> > Old [496] Old  Old  Old  Old  Old  Old  Mid     Old  Mid     Old  Old
> > [507] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [518] Mid
> > Mid     Mid     Mid     Old  Mid     Old  Mid     Old  Mid     Mid
> > [529] Old  Old  Mid     Mid     Mid     Mid     Mid     Mid     Old
> > Mid     Mid [540] Mid     Mid     Mid     Mid     Mid     Mid     Old
> > Old  Old  Old  Mid [551] Mid     Mid     Old  Old  Mid     Mid
> > Old  Mid     Old  Old  Old [562] Old  Mid     Old  Old  Old  Mid
> > Old  Old  Old  Old  Mid [573] Mid     Mid     Old  Old  Mid     Mid
> > Mid     Mid     Old  Old  Old [584] Mid     Old  Old  Old  Old  Old
> > Old  Mid     Mid     Mid     Old [595] Mid     Mid     Mid     Old
> > Old  New Mid     Mid     Old  Mid     Mid [606] Mid     Old  Mid
> > Old  Old  Mid     Mid     Mid     Mid     Mid     Old [617] Mid
> > Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [628] Old  Old  Mid
> > Old  Old  Old  Old  Old  Old  Old  Old [639] Old  Old  Old  Old  Old
> > Old  Old  Old  Old  Old  New [650] Old  Mid     Old  Old  Old  Old
> > Old  Old  Old  Old  Old [661] Old  Old  Old  Old  Old  Old  Old  Old
> > Old  Old  Old [672] Old  Old  New Old  Old  Old  Old  Old  Old  Old
> > Old [683] New Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [694]
> > Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old [705] Old  Old
> > Old  New Old  Old  New Old  Old  Old  Old [716] New New New New New
> > Old  Old  Old  New Old  Old [727] Old  Old  Old  Old  Old  Old  Mid
> > Old  Old  Old  New [738] Old  Old  Old  Old  Old  Old  Old  Old  Old
> > Old  Old [749] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [760] New Old  Old  Old  Old  Old  Old  Old  Old  Old  New
> > [771] Old  Old  Old  Old  Old  Old  Mid     Old  Old  New Old
> > [782] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [793] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [804] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [815] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [826] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [837] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [848] Old  Old  Old  Old  Old  Old  Old  Old  Old  Old  Old
> > [859] Old  Old  Old  Old  Old  Old  Mid     Mid     Old  Old  Old
> > [870] Old  Old
> > Levels: Mid New Old
> >

The very large coefficient and very large SE for MagNew suggest that your data exhibit what some refer to as "complete separation" or "quasi-complete separation": most likely the outcome takes only one value within the New level, in which case there is no finite maximum likelihood estimate for that coefficient.  The fitting routine simply stops at a large negative value, and the SE is correspondingly huge.  What does a cross-tabulation of Mag with your DV look like?  You might want to read up on quasi-complete separation and suggestions for dealing with it.
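
To make the idea concrete, here is a minimal sketch on simulated data, not your data. The outcome name `sold`, the Mid and Old group sizes, and the sale probabilities are all made up for illustration; only the 29 New cases out of 871 come from your post. It also uses a plain glm() rather than lmer, purely to keep the example small.

```r
# Illustrative simulation only: 'sold', the Mid/Old group sizes and the
# sale probabilities are assumptions, not values from the original data.
set.seed(1)
n_mid <- 200; n_new <- 29; n_old <- 642      # 871 rows in total
Mag  <- factor(rep(c("Mid", "New", "Old"), times = c(n_mid, n_new, n_old)))
sold <- c(rbinom(n_mid, 1, 0.20),            # Mid: some items sell
          rep(0L, n_new),                    # New: none sell (separation)
          rbinom(n_old, 1, 0.23))            # Old: some items sell

# Cross-tabulate the factor against the binary outcome;
# a zero cell in the New row is the warning sign.
tab <- table(Mag, sold)
print(tab)

# Plain logistic regression (no random effects) to reproduce the symptom:
# a large negative estimate for MagNew with a very large standard error.
fit <- glm(sold ~ Mag, family = binomial)
print(coef(summary(fit)))
```

If the cross-tabulation of your real data shows a zero in the New row, that would explain the MagNew estimate and its SE.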

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
 


