[R-sig-ME] Can interaction term cause Estimates and Std. Errors to be too large?

Luciano La Sala lucianolasala at yahoo.com.ar
Sun Mar 29 20:47:04 CEST 2009


Dear R-experts,

I am running R version 2.7.1 on Windows Vista. I have a small dataset, which consists of:

# NestID: nest indicator for each chick. Siblings sharing the same nest have the same nest indicator.

# Chick: chick indicator consisting of a unique ID for each single chick.

# Year: 2006, 2007. 

# ClutchSize: 1, 2, or 3 eggs.

# HO: hatching order within each clutch (1, 2, 3 [first, second and third-hatched chick]).

In order to account for lack of independence at the nest level (several chicks are nested within the same nest), I'd like to run a GLMM with random slopes and intercepts for nests.

My approach to model building was as follows: Variables with P ≤ 0.20 on their own in an initial bivariate analysis were forced into the multivariable analysis. Model selection started from a maximal model based on the bivariate analyses and eliminated terms to reach a simpler model that retained only the significant main effects and two-way interactions. The model was reduced by stepwise manual elimination of variables, using the Akaike Information Criterion (AIC) to compare candidate models. Interactions were tested only between main effects that remained in the final model.

My final model for hatching failure (without testing of interaction between main effects) is:

model <- lmer(Hatching ~ HatchOrder + Year + (1|NestID), family=binomial, 1)

I get the following output: 

best.model <- lmer(Hatching~HatchOrder+Year+(1|NestID), family=binomial, 1)

Generalized linear mixed model fit by the Laplace approximation 
Formula: Hatching2 ~ HatchingOrder + Year1 + (1 | NestID) 
 Data: 1 
 AIC      BIC         logLik      deviance
 167.8    185.3       -78.9        157.8

Random effects:
Groups Name              Variance     Std. Dev.
NestID (Intercept)       1.9682        1.4029  

Number of obs: 247, groups: NestID, 120

Fixed effects:
             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   -5.4800      0.8329   -6.579  4.73e-11  ***
HO_Second      1.6344      0.6841    2.389   0.01689  *
HO_Third       3.3007      0.7162    4.609  4.05e-06  ***
Year2006       2.1169      0.6741    3.140   0.00169  **

So far, so good… but then I fit the same model incorporating interaction between the main effects as follows: 

interaction <-lmer(Hatching~HatchOrder+Year+HatchingOrder*Year+(1|NestID), family=binomial,1)

And I get the following output:

Data: 1 
AIC       BIC       logLik      deviance
157.8     182.3     -71.89      143.8

Random effects:
Groups Name              Variance         Std. Dev.
NestID (Intercept)       155.22            12.459  

Number of obs: 247, groups: NestID, 120

Fixed effects:
                                Estimate Std. Error z value Pr(>|z|)   
(Intercept)                     -13.6158     4.8287 -2.8198  0.00481 **
HO_Second                       -23.1961 36249.1930 -0.0006  0.99949   
HO_Third                          5.6624     2.6823  2.1110  0.03477 * 
Year2006                         -0.9602     6.1245 -0.1568  0.87541   
HO_Second:Year2006               30.2249 36249.1931  0.0008  0.99933   
HO_Third:Year2006                10.5549     5.2232  2.0208  0.04331 * 
 

Correlation of Fixed Effects:
            (Intr) HtchOS HtchOT Y12006 HOS:Y1
HtchngOrdrS  0.000                            
HtchngOrdrT -0.384  0.000                     
Year12006   -0.788  0.000  0.303              
HtOS:Y12006  0.000 -1.000  0.000  0.000       
HtOT:Y12006  0.197  0.000 -0.514 -0.556  0.000

Question 1: I am worried about the very large values of the Estimate and Std. Error for "HO_Second" and "HO_Second:Year2006" in the second model (with the interaction term included). What may be causing such large values? Should I be concerned? If so, how can I solve the problem? Is this an over-fitting problem?
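
One way to probe Question 1 is to cross-tabulate the response against the two factors. A huge estimate paired with an enormous standard error is a pattern consistent with a HatchOrder-by-Year cell that contains no failures (or only failures), although that is only a guess here. The sketch below uses a made-up toy data frame; the names `chicks` and `Failed` and all the values are hypothetical and are not taken from the dataset described above.

```r
# Hypothetical toy data (NOT the real dataset): 12 chicks, 2 per cell.
# Failed = 1 means the egg failed to hatch.
chicks <- data.frame(
  HatchOrder = rep(c("First", "Second", "Third"), times = 4),
  Year       = rep(c("2006", "2007"), each = 6),
  Failed     = c(1, 1, 1, 0, 0, 1,   # 2006
                 1, 0, 1, 0, 0, 0)   # 2007
)

# Number of failures in each HatchOrder x Year cell.
failures <- xtabs(Failed ~ HatchOrder + Year, data = chicks)
# Number of chicks in each cell, to spot cells where every egg failed.
totals <- xtabs(~ HatchOrder + Year, data = chicks)

print(failures)
print(totals)

# Cells with zero failures, or with failures equal to the cell total,
# cannot support a finite interaction estimate on the logit scale.
problem_cells <- (failures == 0) | (failures == totals)
print(problem_cells)
```

In this toy example only the Second/2007 cell has no failures. Running the same cross-tabulation on the real data would show whether a comparable empty cell exists.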

Question 2: The Estimate for "Year2006" becomes negative in the second model. Any clue as to why this happens? 

Question 3: Should I stick with the simpler model 1, which does not assess the interaction?
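
As a rough aid for Question 3, the two fits can be compared using only the numbers printed in the outputs above. The sketch below computes the AIC difference and a likelihood-ratio statistic from the reported deviances; the variable names are made up for illustration. It assumes the usual chi-square reference distribution, which may not be trustworthy if the interaction model is affected by an empty cell, so the result should be read with caution.

```r
# Values copied from the two model summaries above.
dev_main <- 157.8   # deviance, main-effects model
dev_int  <- 143.8   # deviance, model with interaction
aic_main <- 167.8
aic_int  <- 157.8

# The interaction model adds two coefficients
# (HO_Second:Year2006 and HO_Third:Year2006).
df_diff <- 2

# Likelihood-ratio statistic: difference in deviances.
lrt_stat <- dev_main - dev_int

# Upper-tail chi-square probability (about 0.0009 for 14 on 2 df).
p_value <- pchisq(lrt_stat, df = df_diff, lower.tail = FALSE)

# Positive value means the interaction model has the lower AIC.
aic_diff <- aic_main - aic_int

print(c(lrt = lrt_stat, df = df_diff, p = p_value, aic_diff = aic_diff))
```

Both criteria favour the interaction model on paper, but a variance of 155.22 for the nest intercept and standard errors above 36000 suggest the fit itself should be checked before relying on that comparison.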
 
Thank you in advance for the help! 

Lucho 

