[R-sig-ME] In mer_finalize(ans) : gr cannot be computed at initial par (65)

Douglas Bates bates at stat.wisc.edu
Tue Jun 22 18:44:30 CEST 2010


On Tue, Jun 22, 2010 at 10:40 AM, Chris Mcowen <cm744 at st-andrews.ac.uk> wrote:
> Hi Douglas and list,
>
> Yeah sure - sorry i should have done that originally,

Thanks for making the data available.  As I suspected there was a
problem with factor variables not being encoded as factors but there
was also an interesting twist.  When you include the interaction term
there are missing cells in the WOODY_NONWOODY HABIT combinations. (Is
this a consequence of the definition of the habitat levels?)  These
missing cells result in a rank-deficient model matrix for the fixed
effects.

The numerical methods used in glm pick this up but those in glmer
don't.  See the enclosed output.

This point is often overlooked in discussions of how to speed up
computations in R.  Many people feel that using accelerated BLAS
(Basic Linear Algebra Subroutines) will be a panacea but it is exactly
problems like this that necessitate using more careful methods.

>
>
> On 22 Jun 2010, at 16:34, Douglas Bates wrote:
>
> Would it be possible to make the data available so we could check on
> the models being fit?
>
> On Tue, Jun 22, 2010 at 10:14 AM, Chris Mcowen <cm744 at st-andrews.ac.uk> wrote:
>> Dear R List -
>>
>> I am trying to do a relatively simple GLMM but am having a problem.
>>
>> My data is categorical but converted to numeric form in excel ( i.e a=1, b=2 etc) my response is binary and i have two random terms - FAMILY and ORDER.
>>
>> I have been running the model fine calling this -
>>
>> model2 <- lmer(THREAT~1+(1|ORDER/FAMILY) + BREEDING_SYSTEM*LIFE_FORM + WOODY_NONWOODY, family=binomial)
>>
>> Fixed effects:
>>                          Estimate Std. Error z value Pr(>|z|)
>> (Intercept)                2.62364    1.41554   1.853  0.06382 .
>> BREEDING_SYSTEM           -1.24256    0.47227  -2.631  0.00851 **
>> LIFE_FORM                 -0.64368    0.29026  -2.218  0.02659 *
>> WOODY_NONWOODY             0.51817    0.19548   2.651  0.00803 **
>> BREEDING_SYSTEM:LIFE_FORM  0.22289    0.09842   2.265  0.02353 *
>
> How many different levels of BREEDING_SYSTEM and LIFE_FORM do you
> have?  I assume that WOODY_NONWOODY is a binary variable.  If the
> others have more than two possible levels, and your saying that Excel
> stored them in numeric form (a = 1, b = 2, etc.)  leads me to believe
> that there may be more than two, then you are not fitting an
> appropriate model.
>
>> However i want to know what type of life form, breeding system etc is significant so i called -
>>
>> woodynonwoody <- as.factor(WOODY_NONWOODY)
>> habit <- as.factor(HABIT)
>> breedingsystem <- as.factor(BREEDING_SYSTEM)
>>
>> model3 <- lmer(THREAT~1+(1|ORDER/FAMILY) + breedingsystem + woodynonwoody*habit, family=binomial)
>> Warning message:
>> In mer_finalize(ans) : gr cannot be computed at initial par (65)
>>
>> model3
>> Error in asMethod(object) : matrix is not symmetric [1,2]
>>
>> Interestingly when i try the above call without the interaction term it works
>
> Yes.  Generalized linear models and generalized linear mixed models
> can't support a large number of possibly redundant coefficients.  You
> need to be careful of your model-building strategy.  Starting from the
> most complex model possible and using backward elimination doesn't
> always work.
>
>>
>> Any help would be greatly appreciated
>>
>> Chris
>>
>>
>>> sessionInfo()
>> R version 2.11.1 (2010-05-31)
>> i386-apple-darwin9.8.0
>>
>> locale:
>> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] perturb_2.03       lme4_0.999375-34   Matrix_0.999375-40 lattice_0.18-8
>>
>> loaded via a namespace (and not attached):
>> [1] grid_2.11.1   nlme_3.1-96   stats4_2.11.1 tools_2.11.1
>>
>>
>> Chris Mcowen
>> PhD Student
>>
>> Room 15
>> Sir Harold Mitchell Building
>> University of St Andrews
>> St Andrews
>> Fife
>> KY16 9TH
>> UK
>> Phone 01334 463381
>>
>>
>>        [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
>
-------------- next part --------------

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(lme4a)
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

    det

Loading required package: minqa
Loading required package: Rcpp

Attaching package: 'lme4a'

The following object(s) are masked from 'package:stats':

    AIC

> dat <- within(read.delim("~/tests/traits.txt"),
+           {
+               FLORAL_SYMMETRY <- factor(FLORAL_SYMMETRY, labels = c("Y", "N"))
+               THREAT <- factor(THREAT, labels = c("Y", "N"))
+               STORAGE_ORGAN <- factor(STORAGE_ORGAN, labels = c("Y", "N"))
+               WOODY <- factor(WOODY_NONWOODY, labels = c("Y", "N"))
+               HABIT <- factor(HABIT)
+               BREEDING_SYSTEM <- factor(BREEDING_SYSTEM)
+           })
> str(dat)
'data.frame':	1344 obs. of  20 variables:
 $ ORDER           : Factor w/ 9 levels "Alismatales",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ FAMILY          : Factor w/ 43 levels "Agavaceae","Alliaceae",..: 1 1 1 2 2 2 2 2 2 2 ...
 $ GENUS           : Factor w/ 501 levels "Acanthochlamys",..: 10 10 10 19 19 19 19 19 19 19 ...
 $ SPECIES         : Factor w/ 1258 levels "abdulrahmanii",..: 456 591 804 237 103 411 123 172 199 372 ...
 $ LIFE_FORM       : int  5 5 5 5 5 5 5 5 5 5 ...
 $ YES             : int  10 10 10 10 10 10 10 10 10 10 ...
 $ STORAGE_ORGAN   : Factor w/ 2 levels "Y","N": 1 1 1 1 1 1 1 1 1 1 ...
 $ BREEDING_SYSTEM : Factor w/ 3 levels "1","2","3": 3 3 3 3 3 3 3 3 3 3 ...
 $ POLLEN_DISPERSAL: int  2 2 2 2 2 2 2 2 2 2 ...
 $ FRUIT           : int  2 2 2 1 1 1 1 1 1 1 ...
 $ ENDOSPERM       : int  2 2 2 2 2 2 2 2 2 2 ...
 $ HABIT           : Factor w/ 5 levels "1","2","3","4",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ WOODY_NONWOODY  : int  2 2 2 2 2 2 2 2 2 2 ...
 $ L2_REGIONS      : int  2 1 1 2 1 2 1 1 1 1 ...
 $ SEASONALITY     : int  2 2 2 2 2 2 2 2 2 2 ...
 $ ALTITUDE        : int  1 2 5 6 6 6 5 5 5 5 ...
 $ SEED_FRUIT      : int  2 2 2 5 5 5 5 5 5 5 ...
 $ FLORAL_SYMMETRY : Factor w/ 2 levels "Y","N": 1 1 1 1 1 1 1 1 1 1 ...
 $ THREAT          : Factor w/ 2 levels "Y","N": 2 2 1 1 2 2 1 1 1 2 ...
 $ WOODY           : Factor w/ 2 levels "Y","N": 2 2 2 2 2 2 2 2 2 2 ...
> summary(dat)
          ORDER              FAMILY             GENUS           SPECIES    
 Asparagales :615   Orchidaceae :532   Carex       :  48   pallida  :   4  
 Poales      :490   Poaceae     :246   Bulbophyllum:  38   bracteata:   3  
 Arecales    : 62   Cyperaceae  :118   Tillandsia  :  23   debilis  :   3  
 Alismatales : 61   Bromeliaceae: 81   Dendrobium  :  22   elatum   :   3  
 Zingiberales: 48   Arecaceae   : 62   Lepanthes   :  22   gracile  :   3  
 Dioscoreales: 22   Araceae     : 60   Habenaria   :  19   javanica :   3  
 (Other)     : 46   (Other)     :245   (Other)     :1172   (Other)  :1325  
   LIFE_FORM           YES        STORAGE_ORGAN BREEDING_SYSTEM
 Min.   : 1.000   Min.   : 1.00   Y:1103        1: 170         
 1st Qu.: 1.000   1st Qu.:10.00   N: 241        2:  83         
 Median : 5.000   Median :10.00                 3:1091         
 Mean   : 3.704   Mean   : 9.91                                
 3rd Qu.: 5.000   3rd Qu.:10.00                                
 Max.   :10.000   Max.   :10.00                                
                                                               
 POLLEN_DISPERSAL     FRUIT         ENDOSPERM     HABIT    WOODY_NONWOODY 
 Min.   :1.000    Min.   :1.000   Min.   :1.000   1:  60   Min.   :1.000  
 1st Qu.:1.000    1st Qu.:1.000   1st Qu.:1.000   2:   6   1st Qu.:2.000  
 Median :2.000    Median :1.000   Median :2.000   3:1230   Median :2.000  
 Mean   :1.674    Mean   :1.136   Mean   :1.563   4:  38   Mean   :1.906  
 3rd Qu.:2.000    3rd Qu.:1.000   3rd Qu.:2.000   5:  10   3rd Qu.:2.000  
 Max.   :2.000    Max.   :2.000   Max.   :2.000            Max.   :2.000  
                                                                          
   L2_REGIONS     SEASONALITY       ALTITUDE       SEED_FRUIT   
 Min.   : 1.00   Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.: 1.00   1st Qu.:2.000   1st Qu.:1.000   1st Qu.:2.000  
 Median : 1.00   Median :2.000   Median :3.000   Median :2.000  
 Mean   : 1.79   Mean   :1.963   Mean   :3.001   Mean   :3.004  
 3rd Qu.: 2.00   3rd Qu.:2.000   3rd Qu.:5.000   3rd Qu.:4.000  
 Max.   :49.00   Max.   :3.000   Max.   :6.000   Max.   :5.000  
                                                                
 FLORAL_SYMMETRY THREAT  WOODY   
 Y:762           Y:659   Y: 126  
 N:582           N:685   N:1218  
                                 
                                 
                                 
                                 
                                 
> summary(glm1 <- glm(THREAT ~ BREEDING_SYSTEM + WOODY * HABIT, family = binomial, data = dat))

Call:
glm(formula = THREAT ~ BREEDING_SYSTEM + WOODY * HABIT, family = binomial, 
    data = dat)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.5866  -1.1930   0.9706   1.1619   2.0349  

Coefficients: (2 not defined because of singularities)
                  Estimate Std. Error z value Pr(>|z|)   
(Intercept)       -0.06820    0.40392  -0.169  0.86592   
BREEDING_SYSTEM2  -0.29553    0.33997  -0.869  0.38470   
BREEDING_SYSTEM3  -0.47137    0.17330  -2.720  0.00653 **
WOODYN            -1.57186    0.81875  -1.920  0.05488 . 
HABIT2             2.56450    1.16005   2.211  0.02706 * 
HABIT3             0.41879    0.50302   0.833  0.40510   
HABIT4             0.01707    0.50597   0.034  0.97308   
HABIT5            -0.84673    0.87861  -0.964  0.33519   
WOODYN:HABIT2           NA         NA      NA       NA   
WOODYN:HABIT3      1.72928    0.88892   1.945  0.05173 . 
WOODYN:HABIT4    -10.47171  324.74491  -0.032  0.97428   
WOODYN:HABIT5           NA         NA      NA       NA   
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1862.7  on 1343  degrees of freedom
Residual deviance: 1834.6  on 1334  degrees of freedom
AIC: 1854.6

Number of Fisher Scoring iterations: 11

> summary(glm2 <- glm(THREAT ~ BREEDING_SYSTEM + WOODY + HABIT, family = binomial, data = dat))

Call:
glm(formula = THREAT ~ BREEDING_SYSTEM + WOODY + HABIT, family = binomial, 
    data = dat)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.6042  -1.1888   0.9707   1.1660   1.7941  

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)   
(Intercept)       -0.2804     0.3948  -0.710  0.47762   
BREEDING_SYSTEM2  -0.3680     0.3383  -1.088  0.27677   
BREEDING_SYSTEM3  -0.4810     0.1734  -2.774  0.00554 **
WOODYN            -0.2088     0.3018  -0.692  0.48902   
HABIT2             1.4527     0.9438   1.539  0.12374   
HABIT3             0.9969     0.4271   2.334  0.01959 * 
HABIT4             0.2000     0.4953   0.404  0.68636   
HABIT5            -0.6250     0.8743  -0.715  0.47470   
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1862.7  on 1343  degrees of freedom
Residual deviance: 1839.9  on 1336  degrees of freedom
AIC: 1855.9

Number of Fisher Scoring iterations: 4

> anova(glm2, glm1)
Analysis of Deviance Table

Model 1: THREAT ~ BREEDING_SYSTEM + WOODY + HABIT
Model 2: THREAT ~ BREEDING_SYSTEM + WOODY * HABIT
  Resid. Df Resid. Dev Df Deviance
1      1336     1839.9            
2      1334     1834.6  2   5.3305
> pchisq(5.3305, 2, lower = FALSE)
[1] 0.06958196
> print(glmm1 <- glmer(THREAT ~ BREEDING_SYSTEM + WOODY + HABIT + (1|ORDER/FAMILY),
+                      dat, binomial, verbose = 1), corr=FALSE)
npt = 5 , n =  2 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:   8:      1839.93; 0.00000  0.00000 
  0.0020:  12:      1839.93; 0.00000  0.00000 
 0.00020:  14:      1839.93; 0.00000  0.00000 
 2.0e-05:  16:      1839.93; 0.00000  0.00000 
 2.0e-06:  18:      1839.93; 0.00000  0.00000 
 2.0e-07:  21:      1839.93; 0.00000  0.00000 
At return
 26:     1839.9318:  0.00000 1.00000e-07
npt = 16 , n =  10 
rhobeg =  0.2905476 , rhoend =  2.905476e-07 
   0.029:  17:      1839.93; 0.00000  0.00000 -0.280354 -0.367978 -0.480966 -0.208796  1.45274 0.996945 0.200016 -0.624974 
  0.0029:  33:      1839.93; 0.00000  0.00000 -0.280354 -0.367978 -0.480966 -0.208796  1.45274 0.996945 0.200016 -0.624974 
 0.00029:  46:      1839.93; 0.00000  0.00000 -0.280354 -0.367978 -0.480966 -0.208796  1.45274 0.996945 0.200016 -0.624974 
 2.9e-05:  57:      1839.93; 0.00000  0.00000 -0.280354 -0.367978 -0.480966 -0.208796  1.45274 0.996945 0.200016 -0.624974 
 2.9e-06:  68:      1839.93; 0.00000  0.00000 -0.280354 -0.367978 -0.480966 -0.208796  1.45274 0.996945 0.200016 -0.624974 
 2.9e-07:  81:      1839.93; 0.00000  0.00000 -0.280354 -0.367978 -0.480966 -0.208796  1.45274 0.996945 0.200016 -0.624974 
At return
100:     1839.9318:  0.00000  0.00000 -0.280354 -0.367978 -0.480966 -0.208797  1.45274 0.996946 0.200016 -0.624974
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: binomial 
Formula: THREAT ~ BREEDING_SYSTEM + WOODY + HABIT + (1 | ORDER/FAMILY) 
   Data: dat 
      AIC       BIC    logLik  deviance 
1859.9318 1911.9659 -919.9659 1839.9318 

Random effects:
 Groups       Name        Variance Std.Dev.
 FAMILY:ORDER (Intercept) 0        0       
 ORDER        (Intercept) 0        0       
Number of obs: 1344, groups: FAMILY:ORDER, 43; ORDER, 9

Fixed effects:
                 Estimate Std. Error z value
(Intercept)       -0.2804     0.3948  -0.710
BREEDING_SYSTEM2  -0.3680     0.3383  -1.088
BREEDING_SYSTEM3  -0.4810     0.1734  -2.774
WOODYN            -0.2088     0.3018  -0.692
HABIT2             1.4527     0.9438   1.539
HABIT3             0.9969     0.4271   2.334
HABIT4             0.2000     0.4953   0.404
HABIT5            -0.6250     0.8743  -0.715
> 
> proc.time()
   user  system elapsed 
  6.800   0.110   6.888 


More information about the R-sig-mixed-models mailing list