[R] Components of variance
Spencer Graves
spencer.graves at pdf.com
Sun Jun 26 21:18:30 CEST 2005
Hi, Doug:
Thanks for your reply. I'm keenly aware of many of the issues you've
mentioned. (I've learned much of what I know about this from your
innovation research.)
Besides, "lme" provides one answer to that question, if not a great
one. Moreover, if someone is really concerned, couldn't they use
"simulate.lme" in library(nlme), as you did in ch. 4(?) of Pinheiro and
Bates (2000) Mixed-Effects Models in S and S-Plus (Springer) [Or they
can request the French-language report that Renaud mentioned].
I only tried VarCorr(lmer(...)), because the email from you that
Renaud cided sounded to me like it might be doable. If I really wanted
it, I could read your code. I don't want it that badly.
Thanks again,
spencer graves
Douglas Bates wrote:
> On 6/26/05, Spencer Graves <spencer.graves at pdf.com> wrote:
>
>>Hi, Renaud:
>>
>> Thanks for the link to Doug Bates' recent comment on this.
>>Unfortunately, the information contained therein is not enough for me to
>>easily obtain all the required standard errors and write code to make
>>confidence intervals. To be more specific, I'm able to produce the same
>>answers as before using lmer in place of lme as follows:
>>
>>
>>>library(lme4)
>>>set.seed(2)
>>>lot <- rep(LETTERS[1:3], each=9)
>>>lot.e <- rep(rnorm(3), each=9)
>>>wf <- paste(lot, rep(1:9, each=3))
>>>wf.e <- rep(rnorm(9), each=3)
>>>DF <- data.frame(lot=lot, wafer=wf,
>>
>>+ Vt=lot.e+wf.e+rnorm(27))
>>
>>>(fitr <- lmer(Vt~1+(1|lot)+(1|wafer), DF))
>>
>><snip>
>>Random effects:
>> Groups Name Variance Std.Dev.
>> wafer (Intercept) 0.96054 0.98007
>> lot (Intercept) 1.51434 1.23058
>> Residual 1.34847 1.16124
>># of obs: 27, groups: wafer, 9; lot, 3
>>
>>Fixed effects:
>> Estimate Std. Error DF t value Pr(>|t|)
>>(Intercept) 0.60839 0.81329 26 0.7481 0.4611
>>
>> These numbers agree with what I got with lme. I then followed your
>>suggetion to "http://tolstoy.newcastle.edu.au/R/help/05/06/7291.html".
>>This suggested I try str(VarCorr(fitr)):
>>
>>
>>>vcFit <- VarCorr(fitr)
>>>str(vcFit)
>>
>>Formal class 'VarCorr' [package "Matrix"] with 3 slots
>> ..@ scale : num 1.16
>> ..@ reSumry :List of 2
>> .. ..$ wafer:Formal class 'corrmatrix' [package "Matrix"] with 2 slots
>> .. .. .. ..@ .Data : num [1, 1] 1
>> .. .. .. .. ..- attr(*, "dimnames")=List of 2
>> .. .. .. .. .. ..$ : chr "(Intercept)"
>> .. .. .. .. .. ..$ : chr "(Intercept)"
>> .. .. .. ..@ stdDev: Named num 0.844
>> .. .. .. .. ..- attr(*, "names")= chr "(Intercept)"
>> .. ..$ lot :Formal class 'corrmatrix' [package "Matrix"] with 2 slots
>> .. .. .. ..@ .Data : num [1, 1] 1
>> .. .. .. .. ..- attr(*, "dimnames")=List of 2
>> .. .. .. .. .. ..$ : chr "(Intercept)"
>> .. .. .. .. .. ..$ : chr "(Intercept)"
>> .. .. .. ..@ stdDev: Named num 1.06
>> .. .. .. .. ..- attr(*, "names")= chr "(Intercept)"
>> ..@ useScale: logi TRUE
>>
>> Unfortunately, I've been so far unable to translate this into
>>confidence intervals for the variance components that seem to match
>>intervals(lme(...)). The closest I came was as follows:
>>
>>
>>>1.23058*sqrt(exp(qnorm(c(0.025, 0.975))*vcFit at reSumry$lot at stdDev))
>>
>>[1] 0.4356044 3.4763815
>>
>>>0.98007*sqrt(exp(qnorm(c(0.025, 0.975))*vcFit at reSumry$wafer at stdDev))
>>
>>[1] 0.4286029 2.2410887
>>
>> The discrepancy between these numbers and intervals(lme(...)) is 25%
>>for "lot", which suggest that I'm doing something wrong. Moreover, I
>>don't know how to get a similar interval for "scale", and I don't know
>>how to access the estimated variance components other than manually.
>>
>> Comments?
>> Thanks,
>> spencer graves
>>p.s. R 2.1.0 patched under Windows XP.
>>
>>Renaud Lancelot wrote:
>>
>>
>>>Spencer Graves a écrit :
>>>
>>>
>>>> As far as I know, the best available is lme in library(nlme). For
>>>>more information, see the the following:
>>>>
>>>> Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus
>>>>(Springer)
>>>>
>>>> Consider the following example:
>>>>
>>>>
>>>>>set.seed(2)
>>>>>lot <- rep(LETTERS[1:3], each=9)
>>>>>lot.e <- rep(rnorm(3), each=9)
>>>>>wf <- paste(lot, rep(1:9, each=3))
>>>>>wf.e <- rep(rnorm(9), each=3)
>>>>>DF <- data.frame(lot=lot, wafer=wf,
>>>>
>>>>+ Vt=lot.e+wf.e+rnorm(27))
>>>>
>>>>>(fit <- lme(Vt~1, random=~1|lot/wafer, DF))
>>>>
>>>>Linear mixed-effects model fit by REML
>>>> Data: DF
>>>> Log-restricted-likelihood: -48.44022
>>>> Fixed: Vt ~ 1
>>>>(Intercept)
>>>> 0.6083933
>>>>
>>>>Random effects:
>>>> Formula: ~1 | lot
>>>> (Intercept)
>>>>StdDev: 1.230572
>>>>
>>>> Formula: ~1 | wafer %in% lot
>>>> (Intercept) Residual
>>>>StdDev: 0.9801403 1.161218
>>>>
>>>>Number of Observations: 27
>>>>Number of Groups:
>>>> lot wafer %in% lot
>>>> 3 9
>>>>
>>>>>(CI.fit <- intervals(fit))
>>>>
>>>>Approximate 95% confidence intervals
>>>>
>>>> Fixed effects:
>>>> lower est. upper
>>>>(Intercept) -1.100281 0.6083933 2.317068
>>>>attr(,"label")
>>>>[1] "Fixed effects:"
>>>>
>>>> Random Effects:
>>>> Level: lot
>>>> lower est. upper
>>>>sd((Intercept)) 0.3368174 1.230572 4.495931
>>>> Level: wafer
>>>> lower est. upper
>>>>sd((Intercept)) 0.426171 0.9801403 2.254201
>>>>
>>>> Within-group standard error:
>>>> lower est. upper
>>>>0.8378296 1.1612179 1.6094289
>>>>
>>>>>str(CI.fit)
>>>>
>>>>List of 3
>>>> $ fixed : num [1, 1:3] -1.100 0.608 2.317
>>>> ..- attr(*, "dimnames")=List of 2
>>>> .. ..$ : chr "(Intercept)"
>>>> .. ..$ : chr [1:3] "lower" "est." "upper"
>>>> ..- attr(*, "label")= chr "Fixed effects:"
>>>> $ reStruct:List of 2
>>>> ..$ lot :`data.frame': 1 obs. of 3 variables:
>>>> .. ..$ lower: num 0.337
>>>> .. ..$ est. : num 1.23
>>>> .. ..$ upper: num 4.5
>>>> ..$ wafer:`data.frame': 1 obs. of 3 variables:
>>>> .. ..$ lower: num 0.426
>>>> .. ..$ est. : num 0.98
>>>> .. ..$ upper: num 2.25
>>>> ..- attr(*, "label")= chr "Random Effects:"
>>>> $ sigma : atomic [1:3] 0.838 1.161 1.609
>>>> ..- attr(*, "label")= chr "Within-group standard error:"
>>>> - attr(*, "level")= num 0.95
>>>> - attr(*, "class")= chr "intervals.lme"
>>>>
>>>>>diff(log(CI.fit$sigma))
>>>>
>>>> est. upper
>>>>0.32641 0.32641
>>>>
>>>> The last line combined with help for intervals.lme shows that the
>>>>confidence interval for sigma (and doubtless also for lot and wafer
>>>>variance components is based on a normal approximation for the
>>>>distribution of log(sigma).
>>>>
>>>> The state of the art is reflected in "lmer" in library(lme4),
>>>>described in the following:
>>>>
>>>> Doug Bates (2005) "Fitting linear mixed models in R" in R News 5/1
>>>>available from "www.r-project.org" -> Newsletter
>>>>
>>>> However, an "intervals" function is not yet available for "lmer"
>>>>objects.
>>>
>>>
>>>The issue of variance components in lme4 was recently by D. Bates:
>>>
>>>http://tolstoy.newcastle.edu.au/R/help/05/06/7291.html
>>>
>>>Also, a colleague wrote (in French) a nice report - with data and R code
>>>on how to compute variance components, their bias and distribution with
>>>lme (normal approximation , Monte Carlo simulation and delta method).
>>>Let me know if you want it.
>>>
>>>Best,
>>>
>>>Renaud
>>>
>>>
>>>
>>>
>>>> spencer graves
>>>>
>>>>John Sorkin wrote:
>>>>
>>>>
>>>>
>>>>
>>>>>Could someone identify a function that I might use to perform a
>>>>>components of variance analysis? In addition to the variance
>>>>>attributable to each factor, I would also like to obtain the SE of the
>>>>>variances.
>>>>>Thank you,
>>>>>John
>>>>>
>>>>>John Sorkin M.D., Ph.D.
>>>>>Chief, Biostatistics and Informatics
>>>>>Baltimore VA Medical Center GRECC and
>>>>>University of Maryland School of Medicine Claude Pepper OAIC
>>>>>
>>>>>University of Maryland School of Medicine
>>>>>Division of Gerontology
>>>>>Baltimore VA Medical Center
>>>>>10 North Greene Street
>>>>>GRECC (BT/18/GR)
>>>>>Baltimore, MD 21201-1524
>>>>>
>>>>>410-605-7119
>>>>>------ NOTE NEW EMAIL ADDRESS:
>>>>>jsorkin at grecc.umaryland.edu
>
>
> Current versions of lmer do not return a Hessian matrix from the
> optimization step so it would not be easy to create such intervals. I
> do have an analytic gradient from which a Hessian could be derived by
> finite differences but even that wouldn't help without documentation
> on how the profiled deviance is defined and the particular
> parameterization used for the variance components. These are not
> arbitrary - they are the results of a considerable amount of research
> and experimentation but I have not yet written it up. It is not
> overly complicated but it is also not entirely straightforward.
>
> However, I do not regard getting standard errors of the estimates of
> variance components as being important. In fact I think I am doing a
> service to the community by *not* providing them because they are
> inevitably misused.
>
> Think back to your first course in statistics when confidence
> intervals were introduced. You learned that a confidence interval on
> the mean of a population (normally distributed or reasonably close to
> normally distributed) is derived as the sample mean plus or minus a
> multiple of the standard error of the mean. What about a confidence
> interval on the variance of such a population? If you calculated that
> as the estimate of the variance plus or minus a multiple of the
> standard error of that estimate then you probably lost points on that
> part of the assignment or exam because we don't calculate a confidence
> interval in that way. We know that the distribution of S^2 is a
> scaled chi-squared distribution and often very far from symmetric and
> a confidence interval constructed in the manner described above would
> have poor properties and would not reflect the extent of our
> uncertainty about the parameter. And we never discussed at all a
> hypothesis test on whether the population variance was zero.
>
> Now let's go to a much more complicated situation with mixed effects
> models. We know that in the simplest possible case it would be
> extremely misleading to create confidence intervals or perform
> hypothesis tests based on an estimate of a variance and a standard
> error of that estimate but we are perfectly happy to do so in the much
> more complicated situation of variance components or, even worse,
> estimates of covariances. It doesn't make sense. Even though SAS
> PROC MIXED and HLM and MLWiN and Stata and ... all produce such tests
> and such confidence intervals for mixed-effects models they still
> don't make sense. There is no good justification for their use other
> than a vague hand-waving about asymptotics.
>
> Until someone can explain to me a good use of quantities such as
> standard errors of estimates of variances or covariances, I'm not
> going to stop working on all of the other aspects of the code and
> documentation and redirect my effort to producing these. If anyone
> wants to do it they can contact me off-list and I'll explain the
> optimization problem and how to get access to the parameterization and
> the gradient. If you are really energetic I can point you to the
> formulae for an analytic Hessian that I haven't yet had time to
> implement.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com <http://www.pdf.com>
Tel: 408-938-4420
Fax: 408-280-7915
More information about the R-help
mailing list