[R-sig-ME] In mer_finalize(ans) : gr cannot be computed at initial par (65)
Douglas Bates
bates at stat.wisc.edu
Tue Jun 22 18:44:30 CEST 2010
On Tue, Jun 22, 2010 at 10:40 AM, Chris Mcowen <cm744 at st-andrews.ac.uk> wrote:
> Hi Douglas and list,
>
> Yeah sure - sorry i should have done that originally,
Thanks for making the data available. As I suspected there was a
problem with factor variables not being encoded as factors but there
was also an interesting twist. When you include the interaction term
there are missing cells in the WOODY_NONWOODY HABIT combinations. (Is
this a consequence of the definition of the habitat levels?) These
missing cells result in a rank-deficient model matrix for the fixed
effects.
The numerical methods used in glm pick this up but those in glmer
don't. See the enclosed output.
This point is often overlooked in discussions of how to speed up
computations in R. Many people feel that using accelerated BLAS
(Basic Linear Algebra Subroutines) will be a panacea but it is exactly
problems like this that necessitate using more careful methods.
>
>
> On 22 Jun 2010, at 16:34, Douglas Bates wrote:
>
> Would it be possible to make the data available so we could check on
> the models being fit?
>
> On Tue, Jun 22, 2010 at 10:14 AM, Chris Mcowen <cm744 at st-andrews.ac.uk> wrote:
>> Dear R List -
>>
>> I am trying to do a relatively simple GLMM but am having a problem.
>>
>> My data is categorical but converted to numeric form in excel ( i.e a=1, b=2 etc) my response is binary and i have two random terms - FAMILY and ORDER.
>>
>> I have been running the model fine calling this -
>>
>> model2 <- lmer(THREAT~1+(1|ORDER/FAMILY) + BREEDING_SYSTEM*LIFE_FORM + WOODY_NONWOODY, family=binomial)
>>
>> Fixed effects:
>> Estimate Std. Error z value Pr(>|z|)
>> (Intercept) 2.62364 1.41554 1.853 0.06382 .
>> BREEDING_SYSTEM -1.24256 0.47227 -2.631 0.00851 **
>> LIFE_FORM -0.64368 0.29026 -2.218 0.02659 *
>> WOODY_NONWOODY 0.51817 0.19548 2.651 0.00803 **
>> BREEDING_SYSTEM:LIFE_FORM 0.22289 0.09842 2.265 0.02353 *
>
> How many different levels of BREEDING_SYSTEM and LIFE_FORM do you
> have? I assume that WOODY_NONWOODY is a binary variable. If the
> others have more than two possible levels, and your saying that Excel
> stored them in numeric form (a = 1, b = 2, etc.) leads me to believe
> that there may be more than two, then you are not fitting an
> appropriate model.
>
>> However i want to know what type of life form, breeding system etc is significant so i called -
>>
>> woodynonwoody <- as.factor(WOODY_NONWOODY)
>> habit <- as.factor(HABIT)
>> breedingsystem <- as.factor(BREEDING_SYSTEM)
>>
>> model3 <- lmer(THREAT~1+(1|ORDER/FAMILY) + breedingsystem + woodynonwoody*habit, family=binomial)
>> Warning message:
>> In mer_finalize(ans) : gr cannot be computed at initial par (65)
>>
>> model3
>> Error in asMethod(object) : matrix is not symmetric [1,2]
>>
>> Interestingly when i try the above call without the interaction term it works
>
> Yes. Generalized linear models and generalized linear mixed models
> can't support a large number of possibly redundant coefficients. You
> need to be careful of your model-building strategy. Starting from the
> most complex model possible and using backward elimination doesn't
> always work.
>
>>
>> Any help would be greatly appreciated
>>
>> Chris
>>
>>
>>> sessionInfo()
>> R version 2.11.1 (2010-05-31)
>> i386-apple-darwin9.8.0
>>
>> locale:
>> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] perturb_2.03 lme4_0.999375-34 Matrix_0.999375-40 lattice_0.18-8
>>
>> loaded via a namespace (and not attached):
>> [1] grid_2.11.1 nlme_3.1-96 stats4_2.11.1 tools_2.11.1
>>
>>
>> Chris Mcowen
>> PhD Student
>>
>> Room 15
>> Sir Harold Mitchell Building
>> University of St Andrews
>> St Andrews
>> Fife
>> KY16 9TH
>> UK
>> Phone 01334 463381
>>
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
>
-------------- next part --------------
R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(lme4a)
Loading required package: Matrix
Loading required package: lattice
Attaching package: 'Matrix'
The following object(s) are masked from 'package:base':
det
Loading required package: minqa
Loading required package: Rcpp
Attaching package: 'lme4a'
The following object(s) are masked from 'package:stats':
AIC
> dat <- within(read.delim("~/tests/traits.txt"),
+ {
+ FLORAL_SYMMETRY <- factor(FLORAL_SYMMETRY, labels = c("Y", "N"))
+ THREAT <- factor(THREAT, labels = c("Y", "N"))
+ STORAGE_ORGAN <- factor(STORAGE_ORGAN, labels = c("Y", "N"))
+ WOODY <- factor(WOODY_NONWOODY, labels = c("Y", "N"))
+ HABIT <- factor(HABIT)
+ BREEDING_SYSTEM <- factor(BREEDING_SYSTEM)
+ })
> str(dat)
'data.frame': 1344 obs. of 20 variables:
$ ORDER : Factor w/ 9 levels "Alismatales",..: 3 3 3 3 3 3 3 3 3 3 ...
$ FAMILY : Factor w/ 43 levels "Agavaceae","Alliaceae",..: 1 1 1 2 2 2 2 2 2 2 ...
$ GENUS : Factor w/ 501 levels "Acanthochlamys",..: 10 10 10 19 19 19 19 19 19 19 ...
$ SPECIES : Factor w/ 1258 levels "abdulrahmanii",..: 456 591 804 237 103 411 123 172 199 372 ...
$ LIFE_FORM : int 5 5 5 5 5 5 5 5 5 5 ...
$ YES : int 10 10 10 10 10 10 10 10 10 10 ...
$ STORAGE_ORGAN : Factor w/ 2 levels "Y","N": 1 1 1 1 1 1 1 1 1 1 ...
$ BREEDING_SYSTEM : Factor w/ 3 levels "1","2","3": 3 3 3 3 3 3 3 3 3 3 ...
$ POLLEN_DISPERSAL: int 2 2 2 2 2 2 2 2 2 2 ...
$ FRUIT : int 2 2 2 1 1 1 1 1 1 1 ...
$ ENDOSPERM : int 2 2 2 2 2 2 2 2 2 2 ...
$ HABIT : Factor w/ 5 levels "1","2","3","4",..: 3 3 3 3 3 3 3 3 3 3 ...
$ WOODY_NONWOODY : int 2 2 2 2 2 2 2 2 2 2 ...
$ L2_REGIONS : int 2 1 1 2 1 2 1 1 1 1 ...
$ SEASONALITY : int 2 2 2 2 2 2 2 2 2 2 ...
$ ALTITUDE : int 1 2 5 6 6 6 5 5 5 5 ...
$ SEED_FRUIT : int 2 2 2 5 5 5 5 5 5 5 ...
$ FLORAL_SYMMETRY : Factor w/ 2 levels "Y","N": 1 1 1 1 1 1 1 1 1 1 ...
$ THREAT : Factor w/ 2 levels "Y","N": 2 2 1 1 2 2 1 1 1 2 ...
$ WOODY : Factor w/ 2 levels "Y","N": 2 2 2 2 2 2 2 2 2 2 ...
> summary(dat)
ORDER FAMILY GENUS SPECIES
Asparagales :615 Orchidaceae :532 Carex : 48 pallida : 4
Poales :490 Poaceae :246 Bulbophyllum: 38 bracteata: 3
Arecales : 62 Cyperaceae :118 Tillandsia : 23 debilis : 3
Alismatales : 61 Bromeliaceae: 81 Dendrobium : 22 elatum : 3
Zingiberales: 48 Arecaceae : 62 Lepanthes : 22 gracile : 3
Dioscoreales: 22 Araceae : 60 Habenaria : 19 javanica : 3
(Other) : 46 (Other) :245 (Other) :1172 (Other) :1325
LIFE_FORM YES STORAGE_ORGAN BREEDING_SYSTEM
Min. : 1.000 Min. : 1.00 Y:1103 1: 170
1st Qu.: 1.000 1st Qu.:10.00 N: 241 2: 83
Median : 5.000 Median :10.00 3:1091
Mean : 3.704 Mean : 9.91
3rd Qu.: 5.000 3rd Qu.:10.00
Max. :10.000 Max. :10.00
POLLEN_DISPERSAL FRUIT ENDOSPERM HABIT WOODY_NONWOODY
Min. :1.000 Min. :1.000 Min. :1.000 1: 60 Min. :1.000
1st Qu.:1.000 1st Qu.:1.000 1st Qu.:1.000 2: 6 1st Qu.:2.000
Median :2.000 Median :1.000 Median :2.000 3:1230 Median :2.000
Mean :1.674 Mean :1.136 Mean :1.563 4: 38 Mean :1.906
3rd Qu.:2.000 3rd Qu.:1.000 3rd Qu.:2.000 5: 10 3rd Qu.:2.000
Max. :2.000 Max. :2.000 Max. :2.000 Max. :2.000
L2_REGIONS SEASONALITY ALTITUDE SEED_FRUIT
Min. : 1.00 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.: 1.00 1st Qu.:2.000 1st Qu.:1.000 1st Qu.:2.000
Median : 1.00 Median :2.000 Median :3.000 Median :2.000
Mean : 1.79 Mean :1.963 Mean :3.001 Mean :3.004
3rd Qu.: 2.00 3rd Qu.:2.000 3rd Qu.:5.000 3rd Qu.:4.000
Max. :49.00 Max. :3.000 Max. :6.000 Max. :5.000
FLORAL_SYMMETRY THREAT WOODY
Y:762 Y:659 Y: 126
N:582 N:685 N:1218
> summary(glm1 <- glm(THREAT ~ BREEDING_SYSTEM + WOODY * HABIT, family = binomial, data = dat))
Call:
glm(formula = THREAT ~ BREEDING_SYSTEM + WOODY * HABIT, family = binomial,
data = dat)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.5866 -1.1930 0.9706 1.1619 2.0349
Coefficients: (2 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.06820 0.40392 -0.169 0.86592
BREEDING_SYSTEM2 -0.29553 0.33997 -0.869 0.38470
BREEDING_SYSTEM3 -0.47137 0.17330 -2.720 0.00653 **
WOODYN -1.57186 0.81875 -1.920 0.05488 .
HABIT2 2.56450 1.16005 2.211 0.02706 *
HABIT3 0.41879 0.50302 0.833 0.40510
HABIT4 0.01707 0.50597 0.034 0.97308
HABIT5 -0.84673 0.87861 -0.964 0.33519
WOODYN:HABIT2 NA NA NA NA
WOODYN:HABIT3 1.72928 0.88892 1.945 0.05173 .
WOODYN:HABIT4 -10.47171 324.74491 -0.032 0.97428
WOODYN:HABIT5 NA NA NA NA
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1862.7 on 1343 degrees of freedom
Residual deviance: 1834.6 on 1334 degrees of freedom
AIC: 1854.6
Number of Fisher Scoring iterations: 11
> summary(glm2 <- glm(THREAT ~ BREEDING_SYSTEM + WOODY + HABIT, family = binomial, data = dat))
Call:
glm(formula = THREAT ~ BREEDING_SYSTEM + WOODY + HABIT, family = binomial,
data = dat)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.6042 -1.1888 0.9707 1.1660 1.7941
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.2804 0.3948 -0.710 0.47762
BREEDING_SYSTEM2 -0.3680 0.3383 -1.088 0.27677
BREEDING_SYSTEM3 -0.4810 0.1734 -2.774 0.00554 **
WOODYN -0.2088 0.3018 -0.692 0.48902
HABIT2 1.4527 0.9438 1.539 0.12374
HABIT3 0.9969 0.4271 2.334 0.01959 *
HABIT4 0.2000 0.4953 0.404 0.68636
HABIT5 -0.6250 0.8743 -0.715 0.47470
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1862.7 on 1343 degrees of freedom
Residual deviance: 1839.9 on 1336 degrees of freedom
AIC: 1855.9
Number of Fisher Scoring iterations: 4
> anova(glm2, glm1)
Analysis of Deviance Table
Model 1: THREAT ~ BREEDING_SYSTEM + WOODY + HABIT
Model 2: THREAT ~ BREEDING_SYSTEM + WOODY * HABIT
Resid. Df Resid. Dev Df Deviance
1 1336 1839.9
2 1334 1834.6 2 5.3305
> pchisq(5.3305, 2, lower = FALSE)
[1] 0.06958196
> print(glmm1 <- glmer(THREAT ~ BREEDING_SYSTEM + WOODY + HABIT + (1|ORDER/FAMILY),
+ dat, binomial, verbose = 1), corr=FALSE)
npt = 5 , n = 2
rhobeg = 0.2 , rhoend = 2e-07
0.020: 8: 1839.93; 0.00000 0.00000
0.0020: 12: 1839.93; 0.00000 0.00000
0.00020: 14: 1839.93; 0.00000 0.00000
2.0e-05: 16: 1839.93; 0.00000 0.00000
2.0e-06: 18: 1839.93; 0.00000 0.00000
2.0e-07: 21: 1839.93; 0.00000 0.00000
At return
26: 1839.9318: 0.00000 1.00000e-07
npt = 16 , n = 10
rhobeg = 0.2905476 , rhoend = 2.905476e-07
0.029: 17: 1839.93; 0.00000 0.00000 -0.280354 -0.367978 -0.480966 -0.208796 1.45274 0.996945 0.200016 -0.624974
0.0029: 33: 1839.93; 0.00000 0.00000 -0.280354 -0.367978 -0.480966 -0.208796 1.45274 0.996945 0.200016 -0.624974
0.00029: 46: 1839.93; 0.00000 0.00000 -0.280354 -0.367978 -0.480966 -0.208796 1.45274 0.996945 0.200016 -0.624974
2.9e-05: 57: 1839.93; 0.00000 0.00000 -0.280354 -0.367978 -0.480966 -0.208796 1.45274 0.996945 0.200016 -0.624974
2.9e-06: 68: 1839.93; 0.00000 0.00000 -0.280354 -0.367978 -0.480966 -0.208796 1.45274 0.996945 0.200016 -0.624974
2.9e-07: 81: 1839.93; 0.00000 0.00000 -0.280354 -0.367978 -0.480966 -0.208796 1.45274 0.996945 0.200016 -0.624974
At return
100: 1839.9318: 0.00000 0.00000 -0.280354 -0.367978 -0.480966 -0.208797 1.45274 0.996946 0.200016 -0.624974
Generalized linear mixed model fit by maximum likelihood ['merMod']
Family: binomial
Formula: THREAT ~ BREEDING_SYSTEM + WOODY + HABIT + (1 | ORDER/FAMILY)
Data: dat
AIC BIC logLik deviance
1859.9318 1911.9659 -919.9659 1839.9318
Random effects:
Groups Name Variance Std.Dev.
FAMILY:ORDER (Intercept) 0 0
ORDER (Intercept) 0 0
Number of obs: 1344, groups: FAMILY:ORDER, 43; ORDER, 9
Fixed effects:
Estimate Std. Error z value
(Intercept) -0.2804 0.3948 -0.710
BREEDING_SYSTEM2 -0.3680 0.3383 -1.088
BREEDING_SYSTEM3 -0.4810 0.1734 -2.774
WOODYN -0.2088 0.3018 -0.692
HABIT2 1.4527 0.9438 1.539
HABIT3 0.9969 0.4271 2.334
HABIT4 0.2000 0.4953 0.404
HABIT5 -0.6250 0.8743 -0.715
>
> proc.time()
user system elapsed
6.800 0.110 6.888
More information about the R-sig-mixed-models
mailing list