[R] lme - problems with model

Mon Feb 23 18:55:41 CET 2004

      Doug's "xtabs" suggests to me that the following might be 
estimable, with data.$Temp <- as.numeric(as.character(data.$temp))

      water*(Temp+I(Temp^2))

      It looks like it should be estimable in "lm", and depending on the 
noise model, it should also be estimable in "lme".  ???

      hope this helps.  spencer graves

Douglas Bates wrote:

>CG Pettersson <cg.pettersson at evp.slu.se> writes:
>
>  
>
>>Thanks a lot for the answer!
>>    
>>
>
>  
>
>>Now, I only have the last one left - How do I get round it?
>>I knew about the missing cells in the design, but didn´t know how lme
>>would react on them.
>>    
>>
>
>  
>
>>In this case, I can remove the water:temp term, but how can I be sure
>>that this is the right thing to do?
>>    
>>
>
>Others may be able to come up with inventive ways of creating the
>model matrix to do this but I would simply compare the additive model
>to the cell means model.
>
>  
>
>>milk = read.table("/tmp/milk.txt", header = TRUE)
>>milk$water = factor(milk$water)
>>milk$temp = factor(milk$temp)
>>milk$WaterTemp = factor(paste(milk$water, milk$temp, sep = '/'))
>>xtabs(~ water + temp, milk)
>>    
>>
>     temp
>water 100 110 120 140
>    1 3   3   3   0  
>    2 3   0   3   3  
>    3 3   3   0   3  
>  
>
>>xtabs(~ WaterTemp, milk)
>>    
>>
>WaterTemp
>1/100 1/110 1/120 2/100 2/120 2/140 3/100 3/110 3/140 
>    3     3     3     3     3     3     3     3     3 
>  
>
>>library(lme4)
>>    
>>
>Loading required package: stats4 
>Loading required package: lattice 
>  
>
>>summary(fm1 <- lme(maill6 ~ water + temp, milk, ~ 1 | rep))
>>    
>>
>Linear mixed-effects model fit by REML
> Data: milk 
>      AIC      BIC   logLik
> 6.155644 14.51182 4.922178
>
>Random effects:
> Groups   Name        Variance Std.Dev.
> rep      (Intercept) 0.021838 0.14778 
> Residual             0.017505 0.13231 
>
>Fixed effects: maill6 ~ water + temp 
>             Estimate Std. Error DF t value  Pr(>|t|)    
>(Intercept)  2.194667   0.103828 19 21.1376 1.162e-14 ***
>water2       0.328000   0.068322 19  4.8008 0.0001243 ***
>water3      -0.045333   0.068322 19 -0.6635 0.5149678    
>temp110      0.078000   0.072467 19  1.0764 0.2952465    
>temp120      0.321333   0.072467 19  4.4342 0.0002847 ***
>temp140      0.350667   0.072467 19  4.8390 0.0001140 ***
>---
>Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 
>
>Correlation of Fixed Effects:
>        (Intr) water2 water3 tmp110 tmp120
>water2  -0.329                            
>water3  -0.329  0.500                     
>temp110 -0.310  0.236  0.000              
>temp120 -0.310  0.000  0.236  0.333       
>temp140 -0.155 -0.236 -0.236  0.333  0.333
>
>Number of Observations: 27
>Number of Groups: 3 
>  
>
>>summary(fm2 <- lme(maill6 ~ WaterTemp, milk, ~ 1 | rep))
>>    
>>
>Linear mixed-effects model fit by REML
> Data: milk 
>      AIC      BIC   logLik
> 14.96052 24.75461 3.519738
>
>Random effects:
> Groups   Name        Variance Std.Dev.
> rep      (Intercept) 0.021862 0.14786 
> Residual             0.017286 0.13148 
>
>Fixed effects: maill6 ~ WaterTemp 
>                 Estimate Std. Error DF t value  Pr(>|t|)    
>(Intercept)     2.1766667  0.1142339 16 19.0545 2.016e-12 ***
>WaterTemp1/110  0.1400000  0.1073502 16  1.3041   0.21064    
>WaterTemp1/120  0.3133333  0.1073502 16  2.9188   0.01004 *  
>WaterTemp2/100  0.2833333  0.1073502 16  2.6393   0.01785 *  
>WaterTemp2/120  0.6933333  0.1073502 16  6.4586 7.897e-06 ***
>WaterTemp2/140  0.7333333  0.1073502 16  6.8312 4.035e-06 ***
>WaterTemp3/100  0.0533333  0.1073502 16  0.4968   0.62608    
>WaterTemp3/110  0.0066667  0.1073502 16  0.0621   0.95125    
>WaterTemp3/140  0.2866667  0.1073502 16  2.6704   0.01676 *  
>---
>Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 
>
>Correlation of Fixed Effects:
>            (Intr) WT1/11 WT1/12 WT2/10 WT2/12 WT2/14 WT3/10 WT3/11
>WtrTmp1/110 -0.470                                                 
>WtrTmp1/120 -0.470  0.500                                          
>WtrTmp2/100 -0.470  0.500  0.500                                   
>WtrTmp2/120 -0.470  0.500  0.500  0.500                            
>WtrTmp2/140 -0.470  0.500  0.500  0.500  0.500                     
>WtrTmp3/100 -0.470  0.500  0.500  0.500  0.500  0.500              
>WtrTmp3/110 -0.470  0.500  0.500  0.500  0.500  0.500  0.500       
>WtrTmp3/140 -0.470  0.500  0.500  0.500  0.500  0.500  0.500  0.500
>
>Number of Observations: 27
>Number of Groups: 3 
>
>Both AIC and BIC, which are on the scale of "smaller is better",
>indicate strong preference for the additive model.  A likelihood ratio
>test would not show significant improvement in the fit of the cell
>means model relative to the additive model.
>
>In general one should be cautious when using likelihood ratio tests on
>the fixed-effects terms but in this case it is probably ok because the
>estimates for the random effects are nearly identical for the two
>models.
>
>  
>
>>Is the lm run without the random term enough for removing water:temp
>>from the lme model, or do I have to do a PROC MIXED run with the
>>random term to make that decision in a case like this? 
>>    
>>
>
>I would use this analysis instead.
>
>  
>
>>Is it possible (for me)  to understand why MIXED accepts the design
>>but not lme? They ought to get the same sort of problems, or have I
>>missed something?
>>    
>>
>
>Because of the way that model matrices are created in SAS, the
>computational methods *must* detect rank deficiencies and compensate
>for them.  Whenever there is a categorical factor in the model the SAS
>model matrix will be rank deficient.  You may be aware that SAS always
>uses a parameterization of a factor where the last level of the factor
>is the "reference" level.  That is not a choice - it is a necessary
>consequence of the way in which the computation is carried out.  It is
>the detection of the rank deficiency and elimination of the column
>where that is detected that causes SAS to eliminate the coefficient
>for the last level in a factor.
>
>The approach used in the S language is to use a set of k-1 "contrasts"
>to generate the columns for terms involving a factor with k levels.
>This automatically creates a full rank model matrix unless you have
>missing cells.  The lm code check for rank deficiency and compensates
>for it.  However we did not build that capability into lme (from the
>nlme package or from the lme4 package) because there are two possible
>explanations for the rank deficiency and in the one case we want to
>circumvent it and in the other case we don't.  It is difficult to
>distinguish between those two cases so we don't even try.
>
>
>  
>
>>-------------------
>>    
>>
>>>CG Pettersson <cg.pettersson at evp.slu.se> writes:
>>>
>>>      
>>>
>>>>Hello all!
>>>>
>>>>I´m working with some training datasets in a SAS-based course,
>>>>        
>>>>
>>trying
>>    
>>
>>>>to do the same things in lme that I do in PROC MIXED. 
>>>>
>>>>Why don´t lme do an analysis on this dataset when I use the model
>>>>water*temp?
>>>>The trouble comes from the water:temp term, as it works with
>>>>water+temp.
>>>>The data are, indeed, assymetric but lm accepts the water:temp
>>>>        
>>>>
>>term
>>    
>>
>>>>giving results in the F-test near what PROC MIXED produces. MIXED
>>>>accepts the model.
>>>>
>>>>The water:temp term can be removed from the model according to the
>>>>F-test in SAS (and to the lm model without any random term). Doing
>>>>        
>>>>
>>so
>>    
>>
>>>>in both MIXED and lme gives reasonably similar results for both
>>>>systems.
>>>>
>>>>What do the error message mean, and how can I get around this?
>>>>        
>>>>
>>>Because of missing cells in the design
>>>
>>>      
>>>
>>>>xtabs(~water + temp, milk)
>>>>        
>>>>
>>>     temp
>>>water 100 110 120 140
>>>    1 3   3   3   0  
>>>    2 3   0   3   3  
>>>    3 3   3   0   3  
>>>
>>>the model matrix for the fixed effects is rank deficient.  In lm the
>>>rank deficiency is detected and appropriate adjustments made
>>>
>>>      
>>>
>>>>coef(summary(lm(maill6 ~ water * temp, milk)))
>>>>        
>>>>
>>>                  Estimate Std. Error    t value     Pr(>|t|)
>>>(Intercept)     2.17666667  0.1142339 19.0544730 2.218661e-13
>>>water2          0.28333333  0.1615511  1.7538308 9.647013e-02
>>>water3          0.05333333  0.1615511  0.3301329 7.451108e-01
>>>temp110         0.14000000  0.1615511  0.8665987 3.975669e-01
>>>temp120         0.31333333  0.1615511  1.9395305 6.827304e-02
>>>temp140         0.23333333  0.1615511  1.4443312 1.658280e-01
>>>water3:temp110 -0.18666667  0.2284678 -0.8170371 4.245898e-01
>>>water2:temp120  0.09666667  0.2284678  0.4231085 6.772282e-01
>>>water2:temp140  0.21666667  0.2284678  0.9483467 3.555125e-01
>>>
>>>Notice that you would expect 6 degrees of freedom for the
>>>      
>>>
>>interaction
>>    
>>
>>>term but only three coefficients are estimated.
>>>
>>>In lme it is much more difficult to compensate for such rank
>>>deficiencies because they could be systematic, like this, or they
>>>could be due to relative precision parameters approaching zero
>>>      
>>>
>>during
>>    
>>
>>>the iterations.  Because of this we just report the error (although
>>>admittedly we could be a bit more explicit about the nature of the
>>>problem - we are reporting the symptom that we detect, not the
>>>probable cause).
>>>
>>>
>>>      
>>>
>>>>The dataset:
>>>>        
>>>>
>>>>>milk
>>>>>          
>>>>>
>>>>   water temp rep maill4 maill6 maill8 taste4 taste6 taste8
>>>>1      1  100   1   2.90   2.13   2.39   10.1   10.0    9.6
>>>>2      1  100   2   2.19   2.20   2.27   11.0    9.3   11.0
>>>>3      1  100   3   2.13   2.20   2.41   10.1    7.0    9.6
>>>>4      1  110   1   2.13   2.34   2.41   11.0   10.5    9.8
>>>>5      1  110   2   2.32   2.27   2.25   11.0   11.3   11.2
>>>>6      1  110   3   2.13   2.34   2.42    9.4   10.7    9.0
>>>>7      1  120   1   2.00   2.49   2.71   11.1   11.2   11.4
>>>>8      1  120   2   2.41   2.49   2.46   11.6   11.7    9.6
>>>>9      1  120   3   2.22   2.49   2.73   10.7   10.3   10.2
>>>>10     2  100   1   2.13   2.41   2.49   11.1   10.8   11.2
>>>>11     2  100   2   2.49   2.34   2.53   11.1   11.2    9.2
>>>>12     2  100   3   2.80   2.63   3.33    8.3    9.7    7.8
>>>>13     2  120   1   2.38   2.85   2.06   11.9   11.2   11.2
>>>>14     2  120   2   2.61   2.70   2.70   11.7   10.8   11.0
>>>>15     2  120   3   2.77   3.06   3.25   10.9    9.0    9.4
>>>>16     2  140   1   2.56   2.84   3.10   10.7   11.2    9.8
>>>>17     2  140   2   2.63   2.61   2.81   10.8   11.0   11.6
>>>>18     2  140   3   2.99   3.28   3.75    9.2    9.6    9.6
>>>>19     3  100   1   2.60   2.24   2.32   10.8    8.4   10.8
>>>>20     3  100   2   2.06   2.11   2.20   11.0   11.2   11.8
>>>>21     3  100   3   1.98   2.34   2.80   10.3   10.2   10.6
>>>>22     3  110   1   1.91   2.06   2.29   11.0   11.4    9.4
>>>>23     3  110   2   1.98   1.98   2.15   10.0   11.8   10.6
>>>>24     3  110   3   1.98   2.51   2.81    9.3    9.2   10.2
>>>>25     3  140   1   2.27   2.42   2.72   10.8   11.6   12.0
>>>>26     3  140   2   2.27   2.20   2.41   11.2   11.0   11.4
>>>>27     3  140   3   2.20   2.77   3.06   10.5   10.2   10.0
>>>>
>>>>The failing model:
>>>>        
>>>>
>>>>>lme(maill6 ~ water * temp  , random= ~1|rep, data = milk)
>>>>>          
>>>>>
>>>>Error in MEEM(object, conLin, control$niterEM) : 
>>>>        Singularity in backsolve at level 0, block 1
>>>>
>>>>The smaller (working) model:
>>>>        
>>>>
>>>>>lme(maill6 ~ water + temp  , random= ~1|rep, data = milk)
>>>>>          
>>>>>
>>>>Linear mixed-effects model fit by REML
>>>>  Data: milk 
>>>>  Log-restricted-likelihood: 4.922178
>>>>  Fixed: maill6 ~ water + temp 
>>>>(Intercept)      water2      water3     temp110     temp120    
>>>>temp140 
>>>> 2.19466667  0.32800000 -0.04533333  0.07800000  0.32133333 
>>>>0.35066667 
>>>>
>>>>Random effects:
>>>> Formula: ~1 | rep
>>>>        (Intercept)  Residual
>>>>StdDev:   0.1477760 0.1323057
>>>>
>>>>Number of Observations: 27
>>>>Number of Groups: 3 
>>>>        
>>>>
>>>>
>>>>
>>>>CG Pettersson, MSci, PhD Stud.
>>>>Swedish University of Agricultural Sciences
>>>>Dep. of Ecology and Crop Production. Box 7043
>>>>SE-750 07 Uppsala
>>>>
>>>>______________________________________________
>>>>R-help at stat.math.ethz.ch mailing list
>>>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide!
>>>>        
>>>>
>>http://www.R-project.org/posting-guide.html
>>    
>>
>>>-- 
>>>Douglas Bates                            bates at stat.wisc.edu
>>>Statistics Department                    608/262-2598
>>>University of Wisconsin - Madison       
>>>      
>>>
>>http://www.stat.wisc.edu/~bates/
>>    
>>
>>CG Pettersson, MSci, PhD Stud.
>>Swedish University of Agricultural Sciences
>>Dep. of Ecology and Crop Production. Box 7043
>>SE-750 07 Uppsala
>>    
>>
>
>  
>