[R] Problem with Variance Components (and general glmm confusion)
Spencer Graves
spencer.graves at pdf.com
Sun Sep 17 19:30:44 CEST 2006
With minor modifications, I got your example to work. After
reading in your 'testdata', I made 'site' and 'Array' factors:
testdata$site <- factor(testdata$site)
testdata$Array <- factor(testdata$Array)
# Note: S-Plus and R are case sensitive, so 'array' and 'Array' are two
# different things.
Then, rather than using 'attach', I used the 'data' argument. It's a
little extra bookkeeping, but to me it seems cleaner.
library(lme4)
modelusd <- lmer(USD ~ 1 + (1|forest/site), data = testdata)
MC.modelusd <- mcmcsamp(modelusd, 50000)
modelusd2 <- lmer(USD ~ 1 + (1|forest/site/Array), data = testdata)
MC.modelusd2 <- mcmcsamp(modelusd2, 50000)
These all worked fine for me; see 'sessionInfo()' below.
However, the second model is nonsense, because there is only one
observation per Array within each forest/site combination (150
observations and 150 array:(site:forest) groups). Thus, Array is
completely confounded with the residual. The variance decompositions for
modelusd and modelusd2 are essentially identical, except that the
residual variance in the first model is split in the second into the
Array:(site:forest) and residual variances (7.5559 + 2.8776 = 10.43 in
the summary you quote below). From HPDinterval, I got the following for
the two:
> HPDinterval(MC.modelusd)
lower upper
(Intercept) -4.6265645 24.168019
log(sigma^2) 2.1184607 2.595467
log(st:f.(In)) 0.8987962 2.901460
log(frst.(In)) 1.7248681 6.889331
deviance 814.9391134 827.032135
attr(,"Probability")
[1] 0.95
> HPDinterval(MC.modelusd2)
lower upper
(Intercept) -4.6414218 23.653071
log(sigma^2) -16.4372306 2.666573
log(A:(:.(In)) -4.6028284 2.744530
log(st:f.(In)) 0.8660924 2.987095
log(frst.(In)) 1.4793119 6.981519
deviance 814.8988318 827.634986
attr(,"Probability")
[1] 0.95
>
The intervals for site and forest are similar between the two
models. However, while the interval for log(sigma^2) is not terribly wide
in the first model, in the second model both log(sigma^2) and
log(var(Array)) are essentially unbounded below: the lower limit for
sigma^2 is exp(-16.44) = 7e-8 and for var(Array) it is exp(-4.6) = 0.01,
reflecting the confounding noted above.
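To see where those numbers come from, you can back-transform the
log-scale limits directly in R (the values are simply copied from the
HPDinterval output above):
exp(c(-16.4372306, 2.666573))  # sigma^2 in modelusd2: about 7e-8 to 14.4
exp(c(-4.6028284, 2.744530))   # var(Array) in modelusd2: about 0.010 to 15.6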
Regarding intraclass correlation, it's not clear to me what
definition you are using. (Dave Howell notes that there are many
different definitions;
"www.uvm.edu/~dhowell/StatPages/More_Stuff/icc/icc.html".) Whatever
definition you are using, however, there is no easy way to combine
multiple intervals into one with a transformation. Before MCMC, it was
common to use a first-order Taylor approximation (the delta method). That
procedure was great when nothing else sensible was available, but it
could be highly misleading if the original intervals were not symmetric
or if the linear approximation was poor over most of the range of
interest. Fortunately, with MCMC it's fairly easy: you transform the
samples themselves and summarize the result. The documentation says that
the output of 'mcmcsamp' is an object of class 'mcmc'. Using 'str', I
find that it looks like it might be a numeric array, which is confirmed
as follows:
> is.array(MC.modelusd)
[1] TRUE
Therefore, I will try to just add columns to that array as follows:
# Back-transform the log-variances to the variance scale:
MC.modelusd. <- cbind(MC.modelusd,
                      sigma2 = exp(MC.modelusd[, "log(sigma^2)"]),
                      site   = exp(MC.modelusd[, "log(st:f.(In))"]),
                      forest = exp(MC.modelusd[, "log(frst.(In))"]))
# Intraclass correlation for site: var(site) / (var(site) + sigma^2)
MC.modelusd.a <- cbind(MC.modelusd.,
                       intraclass.site = MC.modelusd.[, "site"] /
                         (MC.modelusd.[, "site"] + MC.modelusd.[, "sigma2"]))
When I tried 'HPDinterval(MC.modelusd.a)' at this point, I got an
error message. Therefore, I changed the class and tried again:
class(MC.modelusd.a) <- class(MC.modelusd)
HPDinterval(MC.modelusd.a)
lower upper
(Intercept) -4.626564547 24.1680185
log(sigma^2) 2.118460743 2.5954665
log(st:f.(In)) 0.898796162 2.9014600
log(frst.(In)) 1.724868091 6.8893307
deviance 814.939113359 827.0321354
sigma2 8.198757903 13.2489330
site 1.627992273 15.8475869
forest 0.003788248 650.2006949
intraclass.site 0.173906803 0.6321633
attr(,"Probability")
[1] 0.95
If you follow this example, I believe you should be able to get
other desired "highest posterior density (HPD)" intervals.
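For instance, here is an untested sketch (reusing the objects and the
column names created above) of how one might also get a forest-level
intraclass correlation, and then rerun the chain to check that the
intervals are reproducible, as Doug Bates suggests below:
# Forest-level intraclass correlation on each MCMC draw
tot.var <- MC.modelusd.[, "sigma2"] + MC.modelusd.[, "site"] +
    MC.modelusd.[, "forest"]
MC.modelusd.b <- cbind(MC.modelusd.,
                       intraclass.forest = MC.modelusd.[, "forest"] / tot.var)
class(MC.modelusd.b) <- class(MC.modelusd)  # restore the 'mcmc' class
HPDinterval(MC.modelusd.b)
# These intervals are stochastic, so compare against a second chain
MC.modelusd.chain2 <- mcmcsamp(modelusd, 50000)
HPDinterval(MC.modelusd.chain2)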
Hope this helps.
Spencer Graves
> sessionInfo()
Version 2.3.1 Patched (2006-08-13 r38872)
i386-pc-mingw32
attached base packages:
[1] "methods" "stats" "graphics" "grDevices" "utils" "datasets"
[7] "base"
other attached packages:
coda lme4 Matrix lattice
"0.10-6" "0.995-2" "0.995-16" "0.13-10"
Toby Gardner wrote:
<snip>
> What I am really after are the intra-class correlation coefficients so I can
> demonstrate the variability in a given environmental variable at different
> spatial scales. I can of course calculate the % variance explained for each
> random effect from the summary(lmer). However - and this may be a stupid
> question! - but can the intervals for the StDev of the random effects also
> just be transformed to intervals of the variance (and then converted to %
> values for the intra-class correlation coefficients) by squaring?
>
SG: NO, there is no easy transformation of the intervals.
> Ideally I would like to partition the variance explained by all (three)
> spatially nested scales - forest / site / array - where array is the sample
> unit. Using lmer produces the model summary I want:
>
>> modelusd2<-lmer(USD~1 + (1|forest/site/array))
>> summary(modelusd2)
>
> Linear mixed-effects model fit by REML
> Formula: USD ~ 1 + (1 | forest/site/array)
>
>       AIC      BIC    logLik MLdeviance REMLdeviance
>  818.7469 830.7894 -405.3734   815.0236     810.7469
>
> Random effects:
>  Groups              Name        Variance Std.Dev.
>  array:(site:forest) (Intercept)  7.5559  2.7488
>  site:forest         (Intercept)  6.2099  2.4920
>  forest              (Intercept) 33.0435  5.7484
>  Residual                         2.8776  1.6963
>
> number of obs: 150, groups: array:(site:forest), 150; site:forest, 15; forest, 3
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)   9.8033     3.3909  2.8911
>
> However - the mcmcsamp process fails
>
>> MC.modelusd2<-mcmcsamp(modelusd2, 50000)
>
> with this error message:
>
> Error: Leading minor of order 1 in downdated X'X is not positive definite
> Error in t(.Call(mer_MCMCsamp, object, saveb, n, trans, verbose)) :
>   unable to find the argument 'x' in selecting a method for function 't'
>
> Am I trying something impossible here?
>
> Regarding GLMMs..(now with species count data, blocking random factors and
> multiple fixed factors)
>
> When using lmer I would suggest using method = "Laplace" and perhaps
> control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
> another reply to the list a few minutes ago.
>
> This seems to work well, thanks.
>
> With the greatest respect to all concerned, if I could I would like to echo
> the request by Martin Maechler on the list a few weeks ago that it would be
> extremely useful (especially for newcomers like me - and likely would
> greatly reduce the traffic on this list looking at many of the past threads)
> if authors of packages were able to be explicit in the help files about how
> functions differ (key advantages and disadvantages) from packages offering
> otherwise very similar functions (e.g. lmer/glmmML - although the subsequent
> comment by Dr Bates on this helped a lot).
>
> Many thanks!
>
> Toby Gardner
>
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status
> major 2
> minor 3.1
> year 2006
> month 06
> day 01
> svn rev 38247
> language R
> version.string Version 2.3.1 (2006-06-01)
>
>> dump("testdata", file=stdout())
>>
> testdata <-
> structure(list(forest = structure(as.integer(c(2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Label = c("EUC",
> "PF", "SF"), class = "factor"), site = as.integer(c(1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5,
> 5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
> 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5
> )), Array = as.integer(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)), USD = c(13.67,
> 13, 9.33, 10.67, 11, 10, 11.67, 10.33, 7, 8, 5.67, 7.67, 13.67,
> 10.33, 10.67, 7, 10.33, 12, 14, 7.67, 6.67, 4.33, 9, 13.67, 11,
> 19, 14, 19.67, 18.67, 8.33, 4.67, 10, 5, 11, 8, 7.67, 11, 12,
> 11, 7.67, 13.67, 15.33, 12, 11.33, 14.67, 13.33, 7, 12, 11.33,
> 11.33, 0.67, 3.33, 2, 2.67, 0.33, 1.33, 1.33, 1, 0.67, 0, 3.33,
> 3.33, 5.67, 4.67, 1.33, 3.67, 1, 6.33, 3, 1.67, 3.67, 5.67, 5.33,
> 2.67, 1.67, 2.33, 3.67, 6.67, 5.33, 9, 8, 5.67, 2.67, 0, 4.33,
> 8, 6, 3.67, 10.67, 9, 0.67, 1, 1.33, 0.67, 3, 1.5, 0.67, 0.33,
> 7.67, 7.33, 22, 18.67, 16.67, 21.33, 22.33, 22.33, 17.67, 16.33,
> 19.67, 20.67, 21.33, 17.33, 19, 19.33, 18.33, 18.67, 18.33, 19,
> 18.33, 19.33, 17.33, 12.33, 9.33, 8.33, 6, 17, 18, 8, 12, 15.67,
> 6, 1.33, 19, 11.67, 7, 16.33, 16, 14, 10.33, 4, 19, 19.67, 14,
> 15.33, 14, 6.33, 11.33, 11.67, 14, 15.33)), .Names = c("forest",
> "site", "Array", "USD"), class = "data.frame", row.names = c("1",
> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
> "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
> "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
> "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57",
> "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68",
> "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79",
> "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90",
> "91", "92", "93", "94", "95", "96", "97", "98", "99", "100",
> "101", "102", "103", "104", "105", "106", "107", "108", "109",
> "110", "111", "112", "113", "114", "115", "116", "117", "118",
> "119", "120", "121", "122", "123", "124", "125", "126", "127",
> "128", "129", "130", "131", "132", "133", "134", "135", "136",
> "137", "138", "139", "140", "141", "142", "143", "144", "145",
> "146", "147", "148", "149", "150"))
>
> ----- Original Message -----
> From: "Douglas Bates" <bates at stat.wisc.edu>
> To: "Toby Gardner" <t.gardner at uea.ac.uk>
> Cc: "R-help" <r-help at stat.math.ethz.ch>
> Sent: Thursday, September 07, 2006 1:23 AM
> Subject: Re: [R] Problem with Variance Components (and general glmm
> confusion)
>
>
>
>> On 9/4/06, Toby Gardner <t.gardner at uea.ac.uk> wrote:
>>
>>> Dear list,
>>>
>>> I am having some problems with extracting Variance Components from a
>>> random-effects model:
>>>
>>> I am running a simple random-effects model using lme:
>>>
>>> model<-lme(y~1,random=~1|groupA/groupB)
>>>
>>> which returns the output for the StdDev of the Random effects, and model
>>> AIC etc as expected.
>>>
>>> Until yesterday I was using R v. 2.0, and had no problem in calling the
>>> variance components of the above model using VarCorr(model), together
>>> with their 95% confidence intervals using intervals() - although for some
>>> response variables a call to intervals() returns the error: Cannot get
>>> confidence intervals on var-cov components: Non-positive definite
>>> approximate variance-covariance.
>>>
>>> I have now installed R v. 2.3.1 and am now experiencing odd behaviour
>>> with VarCorr(lme.object), with an error message typically being returned:
>>>
>>> Error in VarCorr(model) : no direct or inherited method for function
>>> 'VarCorr' for this call
>>>
>>> Is this known to happen? For instance could it be due to the subsequent
>>> loading of new packages? (lme4 for instance?).
>>>
>> Yes. Avoid loading lme4 and nlme simultaneously.
>>
>>
>>> To get around this problem I have tried running the same model using
>>> lmer:
>>>
>>> model2<-lmer(y~1 + (1|groupA) + (1|groupB))
>>>
>> In recent versions of lme4 you can use the specification
>>
>> model2 <- lmer(y ~ 1 + (1|groupA/groupB))
>>
>> Your version may be correct or not. It depends on what the distinct
>> levels of groupB correspond to. The version with the / is more
>> reliable.
>>
>>
>>> Should this not produce the same model? The variance components are very
>>> similar but not identical, making me think that I am doing something
>>> wrong. I am also correct in thinking that intervals() does not work with
>>> lmer? I get: Error in intervals(model2) : no applicable method for
>>> "intervals"
>>>
>> That is correct. Currently there is no intervals method for an lmer
>> model. You can use mcmcsamp to get a Markov chain Monte Carlo sample
>> to which you can apply HPDinterval from the "coda" package. However,
>> these are stochastic intervals so it is best to try a couple of
>> chains to check on the reproducibility of the intervals.
>>
>>
>>> GLMM
>>>
>>> I have a general application question - please excuse my ignorance, I am
>>> relatively new to this and trying to find a way through the maze. In
>>> short I need to compile generalized linear mixed models both for (a)
>>> Poisson data and (b) binomial data incorporating two nested random
>>> factors, and I need to be able to extract AIC values as I am taking an
>>> information-theoretic approach to model selection. Prior to sending an
>>> email to the list I have spent quite a few days reading the background on
>>> a number of functions, all of which offer potential for this; glmmML,
>>> glmmPQL, lmer, and glmmADMB. I can understand that glmmPQL is unsuitable
>>> because there is no way of knowing the maximised likelihood, but is there
>>> much difference between the remaining three options? I have seen
>>> simulation comparisons published on this list between glmmADMB and
>>> glmmPQL and lmer, but it seems these are before the latest release of
>>> lmer, and also they do not evaluate glmmML. To a newcomer this myriad
>>> of options is bewildering, can anyone offer advice as to the most robust
>>> approach?
>>>
>> Goran can correct me if I am wrong but I don't believe that glmmML can
>> be used with multiple levels of random effects.
>>
>> I'm not sure what the status of glmmADMB is these days. There was
>> some controversy regarding the license applied to some of that code a
>> while back. I don't know if it has been resolved to everyone's
>> satisfaction.
>>
>> When using lmer I would suggest using method = "Laplace" and perhaps
>> control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
>> another reply to the list a few minutes ago.
>>
>> Let us know how it works out.
>>
>>
>>> Many thanks for your time and patience,
>>>
>>> Toby Gardner
>>>
>>> School of Environmental Sciences
>>> University of East Anglia
>>> Norwich, NR4 7TJ
>>> United Kingdom
>>> Email: t.gardner at uea.ac.uk
>>> Website: www.uea.ac.uk/~e387495
>>>