[R] lme - problems with model

Mon Feb 23 18:30:26 CET 2004

CG Pettersson <cg.pettersson at evp.slu.se> writes:

> Thanks a lot for the answer!

> Now, I only have the last one left - How do I get round it?
> I knew about the missing cells in the design, but didn´t know how lme
> would react on them.

> In this case, I can remove the water:temp term, but how can I be sure
> that this is the right thing to do?

Others may be able to come up with inventive ways of creating the
model matrix to do this but I would simply compare the additive model
to the cell means model.

> milk = read.table("/tmp/milk.txt", header = TRUE)
> milk$water = factor(milk$water)
> milk$temp = factor(milk$temp)
> milk$WaterTemp = factor(paste(milk$water, milk$temp, sep = '/'))
> xtabs(~ water + temp, milk)
     temp
water 100 110 120 140
    1 3   3   3   0  
    2 3   0   3   3  
    3 3   3   0   3  
> xtabs(~ WaterTemp, milk)
WaterTemp
1/100 1/110 1/120 2/100 2/120 2/140 3/100 3/110 3/140 
    3     3     3     3     3     3     3     3     3 
> library(lme4)
Loading required package: stats4 
Loading required package: lattice 
> summary(fm1 <- lme(maill6 ~ water + temp, milk, ~ 1 | rep))
Linear mixed-effects model fit by REML
 Data: milk 
      AIC      BIC   logLik
 6.155644 14.51182 4.922178

Random effects:
 Groups   Name        Variance Std.Dev.
 rep      (Intercept) 0.021838 0.14778 
 Residual             0.017505 0.13231 

Fixed effects: maill6 ~ water + temp 
             Estimate Std. Error DF t value  Pr(>|t|)    
(Intercept)  2.194667   0.103828 19 21.1376 1.162e-14 ***
water2       0.328000   0.068322 19  4.8008 0.0001243 ***
water3      -0.045333   0.068322 19 -0.6635 0.5149678    
temp110      0.078000   0.072467 19  1.0764 0.2952465    
temp120      0.321333   0.072467 19  4.4342 0.0002847 ***
temp140      0.350667   0.072467 19  4.8390 0.0001140 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

Correlation of Fixed Effects:
        (Intr) water2 water3 tmp110 tmp120
water2  -0.329                            
water3  -0.329  0.500                     
temp110 -0.310  0.236  0.000              
temp120 -0.310  0.000  0.236  0.333       
temp140 -0.155 -0.236 -0.236  0.333  0.333

Number of Observations: 27
Number of Groups: 3 
> summary(fm2 <- lme(maill6 ~ WaterTemp, milk, ~ 1 | rep))
Linear mixed-effects model fit by REML
 Data: milk 
      AIC      BIC   logLik
 14.96052 24.75461 3.519738

Random effects:
 Groups   Name        Variance Std.Dev.
 rep      (Intercept) 0.021862 0.14786 
 Residual             0.017286 0.13148 

Fixed effects: maill6 ~ WaterTemp 
                 Estimate Std. Error DF t value  Pr(>|t|)    
(Intercept)     2.1766667  0.1142339 16 19.0545 2.016e-12 ***
WaterTemp1/110  0.1400000  0.1073502 16  1.3041   0.21064    
WaterTemp1/120  0.3133333  0.1073502 16  2.9188   0.01004 *  
WaterTemp2/100  0.2833333  0.1073502 16  2.6393   0.01785 *  
WaterTemp2/120  0.6933333  0.1073502 16  6.4586 7.897e-06 ***
WaterTemp2/140  0.7333333  0.1073502 16  6.8312 4.035e-06 ***
WaterTemp3/100  0.0533333  0.1073502 16  0.4968   0.62608    
WaterTemp3/110  0.0066667  0.1073502 16  0.0621   0.95125    
WaterTemp3/140  0.2866667  0.1073502 16  2.6704   0.01676 *  
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

Correlation of Fixed Effects:
            (Intr) WT1/11 WT1/12 WT2/10 WT2/12 WT2/14 WT3/10 WT3/11
WtrTmp1/110 -0.470                                                 
WtrTmp1/120 -0.470  0.500                                          
WtrTmp2/100 -0.470  0.500  0.500                                   
WtrTmp2/120 -0.470  0.500  0.500  0.500                            
WtrTmp2/140 -0.470  0.500  0.500  0.500  0.500                     
WtrTmp3/100 -0.470  0.500  0.500  0.500  0.500  0.500              
WtrTmp3/110 -0.470  0.500  0.500  0.500  0.500  0.500  0.500       
WtrTmp3/140 -0.470  0.500  0.500  0.500  0.500  0.500  0.500  0.500

Number of Observations: 27
Number of Groups: 3 

Both AIC and BIC, which are on the scale of "smaller is better",
indicate strong preference for the additive model.  A likelihood ratio
test would not show significant improvement in the fit of the cell
means model relative to the additive model.

In general one should be cautious when using likelihood ratio tests on
the fixed-effects terms but in this case it is probably ok because the
estimates for the random effects are nearly identical for the two
models.

> Is the lm run without the random term enough for removing water:temp
> from the lme model, or do I have to do a PROC MIXED run with the
> random term to make that decision in a case like this? 

I would use this analysis instead.

> Is it possible (for me)  to understand why MIXED accepts the design
> but not lme? They ought to get the same sort of problems, or have I
> missed something?

Because of the way that model matrices are created in SAS, the
computational methods *must* detect rank deficiencies and compensate
for them.  Whenever there is a categorical factor in the model the SAS
model matrix will be rank deficient.  You may be aware that SAS always
uses a parameterization of a factor where the last level of the factor
is the "reference" level.  That is not a choice - it is a necessary
consequence of the way in which the computation is carried out.  It is
the detection of the rank deficiency and elimination of the column
where that is detected that causes SAS to eliminate the coefficient
for the last level in a factor.

The approach used in the S language is to use a set of k-1 "contrasts"
to generate the columns for terms involving a factor with k levels.
This automatically creates a full rank model matrix unless you have
missing cells.  The lm code check for rank deficiency and compensates
for it.  However we did not build that capability into lme (from the
nlme package or from the lme4 package) because there are two possible
explanations for the rank deficiency and in the one case we want to
circumvent it and in the other case we don't.  It is difficult to
distinguish between those two cases so we don't even try.

> -------------------
> > CG Pettersson <cg.pettersson at evp.slu.se> writes:
> > 
> > > Hello all!
> > > 
> > > I´m working with some training datasets in a SAS-based course,
> trying
> > > to do the same things in lme that I do in PROC MIXED. 
> > > 
> > > Why don´t lme do an analysis on this dataset when I use the model
> > > water*temp?
> > > The trouble comes from the water:temp term, as it works with
> > > water+temp.
> > > The data are, indeed, assymetric but lm accepts the water:temp
> term
> > > giving results in the F-test near what PROC MIXED produces. MIXED
> > > accepts the model.
> > > 
> > > The water:temp term can be removed from the model according to the
> > > F-test in SAS (and to the lm model without any random term). Doing
> so
> > > in both MIXED and lme gives reasonably similar results for both
> > > systems.
> > > 
> > > What do the error message mean, and how can I get around this?
> > 
> > Because of missing cells in the design
> > 
> > > xtabs(~water + temp, milk)
> >      temp
> > water 100 110 120 140
> >     1 3   3   3   0  
> >     2 3   0   3   3  
> >     3 3   3   0   3  
> > 
> > the model matrix for the fixed effects is rank deficient.  In lm the
> > rank deficiency is detected and appropriate adjustments made
> > 
> > > coef(summary(lm(maill6 ~ water * temp, milk)))
> >                   Estimate Std. Error    t value     Pr(>|t|)
> > (Intercept)     2.17666667  0.1142339 19.0544730 2.218661e-13
> > water2          0.28333333  0.1615511  1.7538308 9.647013e-02
> > water3          0.05333333  0.1615511  0.3301329 7.451108e-01
> > temp110         0.14000000  0.1615511  0.8665987 3.975669e-01
> > temp120         0.31333333  0.1615511  1.9395305 6.827304e-02
> > temp140         0.23333333  0.1615511  1.4443312 1.658280e-01
> > water3:temp110 -0.18666667  0.2284678 -0.8170371 4.245898e-01
> > water2:temp120  0.09666667  0.2284678  0.4231085 6.772282e-01
> > water2:temp140  0.21666667  0.2284678  0.9483467 3.555125e-01
> > 
> > Notice that you would expect 6 degrees of freedom for the
> interaction
> > term but only three coefficients are estimated.
> > 
> > In lme it is much more difficult to compensate for such rank
> > deficiencies because they could be systematic, like this, or they
> > could be due to relative precision parameters approaching zero
> during
> > the iterations.  Because of this we just report the error (although
> > admittedly we could be a bit more explicit about the nature of the
> > problem - we are reporting the symptom that we detect, not the
> > probable cause).
> > 
> > 
> > > The dataset:
> > > > milk
> > >    water temp rep maill4 maill6 maill8 taste4 taste6 taste8
> > > 1      1  100   1   2.90   2.13   2.39   10.1   10.0    9.6
> > > 2      1  100   2   2.19   2.20   2.27   11.0    9.3   11.0
> > > 3      1  100   3   2.13   2.20   2.41   10.1    7.0    9.6
> > > 4      1  110   1   2.13   2.34   2.41   11.0   10.5    9.8
> > > 5      1  110   2   2.32   2.27   2.25   11.0   11.3   11.2
> > > 6      1  110   3   2.13   2.34   2.42    9.4   10.7    9.0
> > > 7      1  120   1   2.00   2.49   2.71   11.1   11.2   11.4
> > > 8      1  120   2   2.41   2.49   2.46   11.6   11.7    9.6
> > > 9      1  120   3   2.22   2.49   2.73   10.7   10.3   10.2
> > > 10     2  100   1   2.13   2.41   2.49   11.1   10.8   11.2
> > > 11     2  100   2   2.49   2.34   2.53   11.1   11.2    9.2
> > > 12     2  100   3   2.80   2.63   3.33    8.3    9.7    7.8
> > > 13     2  120   1   2.38   2.85   2.06   11.9   11.2   11.2
> > > 14     2  120   2   2.61   2.70   2.70   11.7   10.8   11.0
> > > 15     2  120   3   2.77   3.06   3.25   10.9    9.0    9.4
> > > 16     2  140   1   2.56   2.84   3.10   10.7   11.2    9.8
> > > 17     2  140   2   2.63   2.61   2.81   10.8   11.0   11.6
> > > 18     2  140   3   2.99   3.28   3.75    9.2    9.6    9.6
> > > 19     3  100   1   2.60   2.24   2.32   10.8    8.4   10.8
> > > 20     3  100   2   2.06   2.11   2.20   11.0   11.2   11.8
> > > 21     3  100   3   1.98   2.34   2.80   10.3   10.2   10.6
> > > 22     3  110   1   1.91   2.06   2.29   11.0   11.4    9.4
> > > 23     3  110   2   1.98   1.98   2.15   10.0   11.8   10.6
> > > 24     3  110   3   1.98   2.51   2.81    9.3    9.2   10.2
> > > 25     3  140   1   2.27   2.42   2.72   10.8   11.6   12.0
> > > 26     3  140   2   2.27   2.20   2.41   11.2   11.0   11.4
> > > 27     3  140   3   2.20   2.77   3.06   10.5   10.2   10.0
> > > 
> > > The failing model:
> > > > lme(maill6 ~ water * temp  , random= ~1|rep, data = milk)
> > > Error in MEEM(object, conLin, control$niterEM) : 
> > >         Singularity in backsolve at level 0, block 1
> > > 
> > > The smaller (working) model:
> > > > lme(maill6 ~ water + temp  , random= ~1|rep, data = milk)
> > > Linear mixed-effects model fit by REML
> > >   Data: milk 
> > >   Log-restricted-likelihood: 4.922178
> > >   Fixed: maill6 ~ water + temp 
> > > (Intercept)      water2      water3     temp110     temp120    
> > > temp140 
> > >  2.19466667  0.32800000 -0.04533333  0.07800000  0.32133333 
> > > 0.35066667 
> > > 
> > > Random effects:
> > >  Formula: ~1 | rep
> > >         (Intercept)  Residual
> > > StdDev:   0.1477760 0.1323057
> > > 
> > > Number of Observations: 27
> > > Number of Groups: 3 
> > > > 
> > > 
> > > 
> > > 
> > > 
> > > CG Pettersson, MSci, PhD Stud.
> > > Swedish University of Agricultural Sciences
> > > Dep. of Ecology and Crop Production. Box 7043
> > > SE-750 07 Uppsala
> > > 
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
> > 
> > -- 
> > Douglas Bates                            bates at stat.wisc.edu
> > Statistics Department                    608/262-2598
> > University of Wisconsin - Madison       
> http://www.stat.wisc.edu/~bates/
> > 
> CG Pettersson, MSci, PhD Stud.
> Swedish University of Agricultural Sciences
> Dep. of Ecology and Crop Production. Box 7043
> SE-750 07 Uppsala

-- 
Douglas Bates                            bates at stat.wisc.edu
Statistics Department                    608/262-2598
University of Wisconsin - Madison        http://www.stat.wisc.edu/~bates/