[R] Problem with Variance Components (and general glmmconfusion)
Toby Gardner
t.gardner at uea.ac.uk
Thu Sep 7 16:22:22 CEST 2006
Dear Dr Bates,
Many thanks for such a useful response to my problem.
Regarding Variance Components .....
The VarCorr function runs fine for lmer objects once the nlme package is
removed.
Regarding the format of the nested random effects for an lmer object, you
said:
In recent versions of lme4 you can use the specification
model2 <- lmer(y ~ 1 + (1|groupA/groupB))
Your version may be correct or not. It depends on what the distinct
levels of groupB correspond to. The version with the / is more
reliable.
This works well. These are environmental data measured at plots within sites
(B), within forests (A).
Here is the model (I have put a dump of the data file used for these
analyses at the end of this email):
> modelusd<-lmer(USD~1 + (1|forest/site))
> summary(modelusd)
Linear mixed-effects model fit by REML
Formula: USD ~ 1 + (1 | forest/site)
AIC BIC logLik MLdeviance REMLdeviance
816.7469 825.7788 -405.3734 815.0236 810.7469
Random effects:
Groups Name Variance Std.Dev.
site:forest (Intercept) 6.2099 2.4920
forest (Intercept) 33.0435 5.7483
Residual 10.4335 3.2301
number of obs: 150, groups: site:forest, 15; forest, 3
Fixed effects:
Estimate Std. Error t value
(Intercept) 9.8033 3.3909 2.8911
And VarCorr confirms the variance components:
> VarCorr(modelusd)
$`site:forest`
1 x 1 Matrix of class "dpoMatrix"
(Intercept)
(Intercept) 6.209851
$forest
1 x 1 Matrix of class "dpoMatrix"
(Intercept)
(Intercept) 33.04345
attr(,"sc")
[1] 3.230100
And following your suggestion I used the HPDinterval to obtain a measure of
error around the random effects:
> MC.modelusd<-mcmcsamp(modelusd, 50000)
> HPDinterval(MC.modelusd)
lower upper
(Intercept) -3.5815626 23.202750
log(sigma^2) 2.1168335 2.594976
log(st:f.(In)) 0.8209994 3.250159
log(frst.(In)) 1.0676778 8.676852
deviance 814.8747050 829.055630
attr(,"Probability")
[1] 0.95
What I am really after are the intra-class correlation coefficients so I can
demonstrate the variability in a given environmental variable at different
spatial scales. I can of course calculate the % variance explained for each
random effect from the summary(lmer). However - and this may be a stupid
question! - but can the intervals for the StDev of the random effects also
just be transformed to intervals of the variance (and then converted to %
values for the intra-class correlation coefficients) by squaring?
Ideally I would like to partition the variance explained by all (three)
spatially nested scales - forest / site / array - where array is the sample
unit. Using lmer produces the model summary I want:
> modelusd2<-lmer(USD~1 + (1|forest/site/array))
> summary(modelusd2)
Linear mixed-effects model fit by REML
Formula: USD ~ 1 + (1 | forest/site/array)
AIC BIC logLik MLdeviance REMLdeviance
818.7469 830.7894 -405.3734 815.0236 810.7469
Random effects:
Groups Name Variance Std.Dev.
array:(site:forest) (Intercept) 7.5559 2.7488
site:forest (Intercept) 6.2099 2.4920
forest (Intercept) 33.0435 5.7484
Residual 2.8776 1.6963
number of obs: 150, groups: array:(site:forest), 150; site:forest, 15;
forest, 3
Fixed effects:
Estimate Std. Error t value
(Intercept) 9.8033 3.3909 2.8911
However - the mcmcsamp process fails
> MC.modelusd2<-mcmcsamp(modelusd2, 50000)
with this error message:
Error: Leading minor of order 1 in downdated X'X is not positive definite
Error in t(.Call(mer_MCMCsamp, object, saveb, n, trans, verbose)) :
unable to find the argument 'x' in selecting a method for
function 't'
>
Am I trying something impossible here?
Regarding GLMMs..(now with species count data, blocking random factors and
multiple fixed factors)
When using lmer I would suggest using method = "Laplace" and perhaps
control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
another reply to the list a few minutes ago.
This seems to work well, thanks.
With the greatest respect to all concerned, if I could I would like to echo
the request by Martin Maechler on the list a few weeks ago that it would be
extremely useful (especially for newcomers like me - and likely would
greatly reduce the traffic on this list looking at many of the past threads)
if authors of packages were able to be explicit in the help files about how
functions differ (key advantages and disadvantages) from packages offering
otherwise very similar functions (e.g. lmer/glmmML - although the subsequent
comment by Dr Bates on this helped a lot).
Many thanks!
Toby Gardner
platform i386-pc-mingw32
arch i386
os mingw32
system i386, mingw32
status
major 2
minor 3.1
year 2006
month 06
day 01
svn rev 38247
language R
version.string Version 2.3.1 (2006-06-01)
> dump("testdata", file=stdout())
testdata <-
structure(list(forest = structure(as.integer(c(2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Label = c("EUC",
"PF", "SF"), class = "factor"), site = as.integer(c(1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5
)), Array = as.integer(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)), USD = c(13.67,
13, 9.33, 10.67, 11, 10, 11.67, 10.33, 7, 8, 5.67, 7.67, 13.67,
10.33, 10.67, 7, 10.33, 12, 14, 7.67, 6.67, 4.33, 9, 13.67, 11,
19, 14, 19.67, 18.67, 8.33, 4.67, 10, 5, 11, 8, 7.67, 11, 12,
11, 7.67, 13.67, 15.33, 12, 11.33, 14.67, 13.33, 7, 12, 11.33,
11.33, 0.67, 3.33, 2, 2.67, 0.33, 1.33, 1.33, 1, 0.67, 0, 3.33,
3.33, 5.67, 4.67, 1.33, 3.67, 1, 6.33, 3, 1.67, 3.67, 5.67, 5.33,
2.67, 1.67, 2.33, 3.67, 6.67, 5.33, 9, 8, 5.67, 2.67, 0, 4.33,
8, 6, 3.67, 10.67, 9, 0.67, 1, 1.33, 0.67, 3, 1.5, 0.67, 0.33,
7.67, 7.33, 22, 18.67, 16.67, 21.33, 22.33, 22.33, 17.67, 16.33,
19.67, 20.67, 21.33, 17.33, 19, 19.33, 18.33, 18.67, 18.33, 19,
18.33, 19.33, 17.33, 12.33, 9.33, 8.33, 6, 17, 18, 8, 12, 15.67,
6, 1.33, 19, 11.67, 7, 16.33, 16, 14, 10.33, 4, 19, 19.67, 14,
15.33, 14, 6.33, 11.33, 11.67, 14, 15.33)), .Names = c("forest",
"site", "Array", "USD"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
"25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
"36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
"47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57",
"58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68",
"69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79",
"80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90",
"91", "92", "93", "94", "95", "96", "97", "98", "99", "100",
"101", "102", "103", "104", "105", "106", "107", "108", "109",
"110", "111", "112", "113", "114", "115", "116", "117", "118",
"119", "120", "121", "122", "123", "124", "125", "126", "127",
"128", "129", "130", "131", "132", "133", "134", "135", "136",
"137", "138", "139", "140", "141", "142", "143", "144", "145",
"146", "147", "148", "149", "150"))
----- Original Message -----
From: "Douglas Bates" <bates at stat.wisc.edu>
To: "Toby Gardner" <t.gardner at uea.ac.uk>
Cc: "R-help" <r-help at stat.math.ethz.ch>
Sent: Thursday, September 07, 2006 1:23 AM
Subject: Re: [R] Problem with Variance Components (and general
glmmconfusion)
> On 9/4/06, Toby Gardner <t.gardner at uea.ac.uk> wrote:
>> Dear list,
>>
>> I am having some problems with extracting Variance Components from a
>> random-effects model:
>>
>> I am running a simple random-effects model using lme:
>>
>> model<-lme(y~1,random=~1|groupA/groupB)
>>
>> which returns the output for the StdDev of the Random effects, and model
>> AIC etc as expected.
>>
>> Until yesterday I was using R v. 2.0, and had no problem in calling the
>> variance components of the above model using VarCorr(model), together
>> with their 95% confidence intervals using intervals() - although for some
>> response variables a call to intervals() returns the error: Cannot get
>> confidence intervals on var-cov components: Non-positive definite
>> approximate variance-covariance.
>>
>> I have now installed R v. 2.3.1 and am now experiencing odd behaviour
>> with VarCorr(lme.object), with an error message typically being returned:
>>
>> Error in VarCorr(model) : no direct or inherited method for function
>> 'VarCorr' for this call
>>
>> Is this known to happen? For instance could it be due to the subsequent
>> loading of new packages? (lme4 for instance?).
>
> Yes. Avoid loading lme4 and nlme simultaneously.
>
>>
>> To get around this problem I have tried running the same model using
>> lmer:
>>
>> model2<-lmer(y~1 + (1|groupA) + (1|groupB))
>
> In recent versions of lme4 you can use the specification
>
> model2 <- lmer(y ~ 1 + (1|groupA/groupB))
>
> Your version may be correct or not. It depends on what the distinct
> levels of groupB correspond to. The version with the / is more
> reliable.
>
>>
>> Should this not produce the same model? The variance components are very
>> similar but not identical, making me think that I am doing something
>> wrong. I am also correct in thinking that intervals() does not work with
>> lmer? I get: Error in intervals(model2) : no applicable method for
>> "intervals"
>
> That is correct. Currently there is no intervals method for an lmer
> model. You can use mcmcsamp to get a Markov chain Monte Carlo sample
> to which you can apply HPDinterval from the "coda" package. However,
> these are stochastic intervals so it is best to try on a couple of
> chains to check on the reproducibility or the intervals.
>
>>
>> GLMM
>>
>> I have a general application question - please excuse my ignorance, I am
>> relatively new to this and trying to find a way through the maze. In
>> short I need to compile generalized linear mixed models both for (a)
>> Poisson data and (b) binonial data incorporating a two nested random
>> factors, and I need to be able to extract AIC values as I am taking an
>> information-theoretic approach to model selection. Prior to sending an
>> email to the list I have spent quite a few days reading the background on
>> a number of functions, all of which offer potential for this; glmmML,
>> glmmPQL, lmer, and glmmADMB. I can understand that glmmPQL is unsuitable
>> because there is no way of knowing the maximised likelihood, but is there
>> much difference between the remaining three options? I have seen
>> simulation comparisons published on this list between glmmADMB and
>> glmmPQL and lmer, but it seems these are before the latest release of
>> lmer, and also they do not evaluate glmmML. To a newcomer this myria!
> d !
>> of options is bewildering, can anyone offer advice as to the most robust
>> approach?
>
> Goran can correct me if I am wrong but I don't believe that glmmML can
> be used with multiple levels of random effects.
>
> I'm not sure what the status of glmmADMB is these days. There was
> some controversy regarding the license applied to some of that code a
> while back. I don't know if it has been resolved to everyone's
> satisfaction.
>
> When using lmer I would suggest using method = "Laplace" and perhaps
> control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
> another reply to the list a few minutes ago.
>
> Let us know how it works out.
>
>>
>> Many thanks for your time and patience,
>>
>> Toby Gardner
>>
>> School of Environmental Sciences
>> University of East Anglia
>> Norwich, NR4 7TJ
>> United Kingdom
>> Email: t.gardner at uea.ac.uk
>> Website: www.uea.ac.uk/~e387495
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list