[R-sig-ME] Variance components analysis using a GLMM, how to insert a variance-covariance matrix in the model ?

Viechtbauer Wolfgang (STAT) wolfgang.viechtbauer at maastrichtuniversity.nl
Wed Jan 28 18:33:21 CET 2015


Thanks, Jesse. I was about to (cautiously) suggest giving metafor a try.

Indeed, if you have a vector of estimates and a corresponding (approximately) known var-cov matrix, then one can tackle this from a meta-analytic perspective (which, in the end, is nothing else than two-stage multilevel modeling, maybe with a twist here and there). You may find this illustration useful:

http://www.metafor-project.org/doku.php/tips:two_stage_analysis

It compares the two-stage approach to a single mixed-effects model for longitudinal data. If one can use a single mixed-effects model for the analyses, then this is usually better, but this may not be possible in all cases and there may be circumstances where the two-stage approach has advantages.

In the present case, things are even simpler, since there is no additional grouping variable. 

You (Célia) wrote that the response variable is a proportion, but you also wrote logit(phi_t), which sounds like a logit-transformed proportion. So, is Rcov the (estimated) var-cov matrix of the raw or logit transformed proportions?

At any rate, you would then want to use rma.mv() from the metafor package, which allows for an entire var-cov matrix of the estimates as input. So:

library(metafor)

load(url("http://mammal-research.org/data/example.RData"))

### this assumes that Rcov is the var-cov matrix of the raw proportions
dat <- data.frame(yi = marmot$estimates$estimate, time = as.numeric(as.character(marmot$estimates$time)) - 1990)

res <- rma.mv(yi, V=marmot$vc, mods = ~ time, data=dat)
res

var.components.reml(dat$yi, design = model.matrix(~ dat$time), vcv=marmot$vc)

But you probably want something like:

dat$id <- 1:nrow(dat)
res <- rma.mv(yi, V=marmot$vc, mods = ~ time, random = ~ 1 | id, data=dat)
res

plot(dat$time, dat$yi, pch=19, xaxt="n", xlab="Year", ylab="Estimate")
axis(side=1, at=dat$time, labels=dat$time+1990)
abline(res)
lines(dat$time, predict(res)$ci.lb, lty="dotted")
lines(dat$time, predict(res)$ci.ub, lty="dotted")

for a random-effects model. The fit looks pretty decent.

An issue here is that the analysis above is based on a model that assumes that the sampling distributions of the estimates can be approximated with normal distributions. With raw proportions, that may not be sensible, especially when some of the proportions are so close to 1 as in the present case.

Some form of a generalized mixed-effects model may be an alternative that is more suitable, but you have proportions and I see no information about the denominator on which those proportions are based. Also, I don't know how MARK computes that var-cov matrix (I downloaded the "gentle introduction" book from the MARK website, but when I realized it has more than 1000 pages, I quickly abandoned the idea of figuring this out). It may already be based on some normal approximation in the first place. Also, I don't see an obvious way of incorporating a known var-cov matrix into a GLMM, since the mean and variance structures of such models are typically intertwined.

Best,
Wolfgang

--   
Wolfgang Viechtbauer, Ph.D., Statistician   
Department of Psychiatry and Psychology   
School for Mental Health and Neuroscience   
Faculty of Health, Medicine, and Life Sciences   
Maastricht University, P.O. Box 616 (VIJV1)   
6200 MD Maastricht, The Netherlands   
+31 (43) 388-4170 | http://www.wvbauer.com   

> -----Original Message-----
> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-
> project.org] On Behalf Of Jesse Whittington
> Sent: Wednesday, January 28, 2015 15:02
> To: REZOUKI CELIA p1314815
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Variance components analysis using a GLMM, how to
> insert a variance-covariance matrix in the model ?
> 
> Hi Célia,
> 
> Check out the rma function in the metafor package.  It is similar to the
> RMark var.components.reml function.
> 
> I've used it for similar analyses with derived annual occupancy
> estimates.
> You can input variance associated with each survival estimate - I'm not
> sure if you can include covariance among estimates.
> 
> Here's an example of your model that I ran in RMark and metafor (sorry
> it's
> not self contained).
> 
> # Multi-year occupancy model.
> m <- try(mark(data=d.proc, ddl=d.ddl, model='RDOccupEG',
> model.parameters=list(Epsilon=f.eps.t,  p=f.p), output=F, silent=T,
> delete=T), silent=TRUE)
>  psi <- m$results$derived
> psi$year <- 0:(nrow(psi) - 1)
> vcv <- m$results$derived.vcv
> fixed.mat1 <- model.matrix( ~ 1 + year, data=psi)  # cannot use this with
> only 2 years of data
> 
> # RMark Test for a linear Trend
> m.trend <- try(var.components.reml(theta=psi$estimate, design=fixed.mat1,
> vcv=vcv), silent=TRUE)  # Linear model for significant trend
> 
> # metafor
> m.trend <- rma(yi = estimate, sei = se, mods = ~ year, data = psi)
> 
> Warning:  I'm not a statistician, so I cannot guarentee that what I've
> done
> is correct.
> 
> Jesse Whittington
> Wildlife Biologist
> Banff National Park
> 
> On Wed, Jan 28, 2015 at 2:17 AM, REZOUKI CELIA p1314815 <
> celia.rezouki at etu.univ-lyon1.fr> wrote:
> 
> > Dear list,
> >
> > We are analysing the survival rates of a mammalian species from a
> > capture-mark-recapture protocol. As a biologist, the usual way to
> > proceed is to analyse capture histories (raw data) with a specific
> > software named MARK (http://www.phidot.org/software/mark/) to run
> > capture-mark-recapture analyses.
> >
> > Our problem is to get an estimation of a random effect of time using
> > linear mixed models, not from the observed data, but from a coefficient
> > vector (let's call it 'phi') representing annual estimates of the
> > survival rates, and the empirical variance/covariance matrix (Rcov)
> > obtained from MARK.
> >
> > We would like to use the output of the analyse (phis and Rcov) from
> MARK
> > in a linear mixed-model in R to extract both a variance components and
> > eventually, to model linear effects of different covariates such as
> > time. The response variable being a proportion, it would be best to use
> > a binomial family and hence, a generalized version of the mixed models.
> >
> > The model would look like:
> > - response variable: logit(phi_t), the annual survival estimated from
> MARK
> > - fixed effects : temporal trends (year entered as a covariable)
> > - random effects : variance in survival around the temporal trend
> > - Rcov, the empirical variance/covariance matrix from MARK is known and
> > should be entered into the GLMM.
> >
> > It is unclear to us whether such an analysis is doable in R or not. The
> > closest we found would be to use mcmcglmm but we would need
> confirmation
> > and somes hint to start.
> >
> > In case you want to help, you can get a vector of estimated survival
> > rates along with the empirical variance/covariance matrix returned by
> > MARK from a subsample of our data here:
> >
> > load(url("http://mammal-research.org/data/example.RData"))
> >
> > Any help would be greatly appreciated.
> >
> > Célia
> >
> > --
> >
> > Célia Rezouki
> > PhD student
> >
> > UMR CNRS 5558 - LBBE
> > Biométrie et Biologie Évolutive
> > UCB Lyon 1 - Bât. Grégor Mendel
> > 43 bd du 11 novembre 1918
> > 69622 VILLEURBANNE cedex


More information about the R-sig-mixed-models mailing list