[R-sig-ME] Very large HPDintervals, mcmcsamp, lmer
Colin Wahl
biowahl at gmail.com
Mon Nov 21 23:54:08 CET 2011
Hello all,
I am using lmer to describe patterns in stream variables between
streams draining watersheds with different land uses. I need to make
some kind of determination of significance. I understand that mixed
models cannot provide accurate p values because the denominator df is
unknown. As per D. Bates recommendation in his post "lmer, p-values
and all that" I used mcmcsamp and HPDinterval to get 95% "confidence
intervals." These intervals were very large and suggest that there are
no significant differences in variables where differences seem
obvious. I'll use one of my variables as an example: total stream
nitrogen. There are 12 streams and 24 sites (two riparian types per
stream).
model<-lmer(TN ~ wsh*rip + (1|stream), data=all24)
Data: all24
AIC BIC logLik deviance REMLdev
78.47 90.25 -29.24 65.15 58.47
Random effects:
Groups Name Variance Std.Dev.
stream (Intercept) 4.10443 2.0259
Residual 0.20885 0.4570
Number of obs: 24, groups: stream, 12
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.04887 1.19903 5.045
wshD -4.65847 1.69569 -2.747
wshF -4.85194 1.58617 -3.059
wshG -5.18847 1.89584 -2.737
ripN -0.18540 0.37314 -0.497
wshD:ripN 0.15800 0.52770 0.299
wshF:ripN 0.09825 0.49362 0.199
wshG:ripN 0.00915 0.58998 0.016
Estimates are accurate; intercept(cultivated) watersheds clearly have
a much higher nitrogen concentration than the other watersheds. I use
mcmcsamp and HPD interval to get 95% CIs:
model.mcmc<-mcmcsamp(model, n=1000)
model.HPDi<-HPDinterval(model.mcmc, prob=0.95)
Estimate Std. Error t
lower upper
Cultivated 6.04886663 1.19903192 5.04479200 3.75759240 8.40959042
Developed 1.39040003 1.69568720 -2.74724406 -4.27036665 6.80232811
Forested 1.19692503 1.58617014 -3.05890364 -4.53524163 6.52712926
Grassland 0.86039999 1.89583593 -2.73676987 -4.96308190 6.93964588
Riparia(Cult) 5.86346654 0.37313911 -0.49686587 0.52284572 11.44574598
Riparia(Devel) 1.36300000 0.52769839 0.29941356 -12.15807916 14.22417096
Riparia(Fores) 1.10977501 0.49361665 0.19904123 -11.93267982 13.77391850
Riparia(Grass) 0.68414998 0.58998474 0.01550901 -13.67422625 14.50577622
Considering that the CI are substantially overlapped, I interpret
these CIs to suggest there is no detectable differences between
cultivated streams and other streams. This cant be right.
I have a very elementary understanding of Bayesian statistics, and
have not used mcmc or HPDinterval before. Is it apparent that I've
done something wrong? Is my sample size too small? I hope this isnt
just a problem with calculating CIs from the output, which I'm pretty
sure is done the same way its done with lmer (see below).
Please help; I'm desperate to wrap up this analysis.
Thank you,
Colin Wahl
Graduate Student
Western Wash. University
Bellingham, WA USA
CI recalculation:
I would be confident that CIs for each fixed effect are relative to
the intercept, but the CIs look wrong so I am questioning myself. Is
it true that the HPDintervals for each fixed effect are relative to
the intercept coefficient, just as they are in the initial lmer
output?
i.e...
lower upper
(Intercept) 3.684270 8.626003
wshD -8.256167 -1.259411
wshF -8.469073 -1.852237
wshG -9.338573 -1.498553
ripN -3.309220 3.105756
wshD:ripN -4.412022 4.944243
wshF:ripN -4.452947 4.422371
wshG:ripN -5.510786 4.917613
becomes:
lower upper
(Intercept) 3.6842697 8.626003
wshD -4.5718974 7.366592
wshF -4.7848029 6.773766
wshG -5.6543034 7.127450
ripN 0.3750498 11.731759
ripD -12.2931391 15.416590
ripF -12.5469694 14.301893
ripG -14.4743090 15.150819
Using the following code:
ci<-as.matrix(model.HPDi$fixef)
C=ci[1,]
D=C+ci[2,]
FF=C+ci[3,]
G=C+ci[4,]
CN=C+ci[5,]
DN=(D)+(ci[5,]+ci[6,])
FN=(FF)+(ci[5,]+ci[7,])
GN=(G)+(ci[5,]+ci[8,])
confint<-c(C,D,FF,G,CN,DN,FN,GN)
--
CW
More information about the R-sig-mixed-models
mailing list