[R-sig-ME] MCMCglmm multivariate meta-analysis with covariance

Thu Oct 20 22:04:24 CEST 2016

Hi Jon,

In this instance you are actually fitting 3 group-level random effects; 
1 in the random part of the model, one in residual part of the model and 
one in the argument to mev.

The random effects and the residual effects are confounded and are not 
identifiable in the likelihood. You would be better just fitting the model:

m <- MCMCglmm::MCMCglmm(outcome ~ 1, rcov=~group, 
mev=dat$obsVar,data=dat, prior=list(R=list(V=1, nu=0)))

it is virtually the same as your original model (since the residual 
variance was set close to zero)  but it will iterate faster and mix 
better. In this model you estimate the between-observation (in your case 
between-group) variance after accounting for the sampling variance. 
There are two additional, equivalent, ways of writing this model:

m.a <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~us(sqrt(obsVar)):group, 
rcov=~group,data=dat, prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1, 
fix=1))))

and

Cinv_sparse<-as(diag(1/dat$obsVar), "dgCMatrix")

rownames(Cinv_sparse)<-dat$group

m.b <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, 
rcov=~group,data=dat, ginverse=list(group=Cinv_sparse), 
prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1, fix=1))))

It is this latter equivalence that I suggested you exploit in the 
bivariate case. Lets say there are now two rows per group (one for spend 
and one for performance) and you have a column in the data-frame 
indicating response type (lets call it sORp)

I don't know how your 2x2 sampling covariance matrices for each group 
are stored, but lets say they are in list form. As an example, with two 
groups:

C<-list(S1=matrix(c(1,0.5, 0.5,1),2,2), S2=matrix(c(2,0, 0,1),2,2))

Cinv_sparse<-as(list2bdiag(lapply(C,solve)), "dgCMatrix")

Assuming the data are sorted response type with group (i.e. in the same 
order as C) then you can fit

dat$observation<-1:nrow(dat)

rownames(Cinv_sparse)<-1:nrow(dat)

m.biv <- MCMCglmm::MCMCglmm(outcome ~ sORp-1, random=~observation, 
rcov=~us(sORp):group,data=dat, ginverse=list(observation=Cinv_sparse), 
prior=list(R=list(V=diag(2), nu=0), G=list(G1=list(V=1, fix=1))))

The rcov term models between-observation (ie. between group) 
(co)variation not due to sampling error. These variances could be zero, 
but you would have to have a pretty effective way of randomising the 
original (unsummarised) observations between groups.

Cheers,

Jarrod

On 20/10/2016 20:31, Jon Bischof wrote:
> Hi Jarrod,
>
> You are making an excellent point, but I am working in a big data 
> setting where only group aggregates, not individual outcomes, will fit 
> in memory.
>
> In brief: this model estimates the log ratio of advertiser spend and 
> performance between a control and experiment condition. These two 
> outcomes are measured at a click level, but we have far too many 
> clicks to store individually in memory. Instead we want to fit a model 
> to advertiser sufficient statistics, which are the log ratio mean and 
> within variance.
>
> For either of these outcomes individually, I was able to fit the 
> following model and recover the true hyperparameters from fake data:
>
>     # Prior distribution for model
>     kMCMCglmmPrior <- list(R=list(fix=1, V=1e-6),
>     G=list(G1=list(V=1e-1, nu=-2)))
>     # dat: A data.frame with columns 'group', 'outcome', and 'obsVar'
>     m <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, mev=dat$obsVar,
>         data=dat, prior=kMCMCglmmPrior)
>
>
> However, for many GLMM implementations (including LMER), the within 
> and between variance are not separately identifiable when aggregate 
> data are provided. This is not true statistically, but the aggregate 
> likelihood requires special handling and cannot be fit from a program 
> that expects one row for each y_ij.
>
> Can MCMCglmm fit my one-dimensional model when R is not fixed and 
> there is only one row per group?
>
> Thanks!
> Jon
>
>
>
> On Thu, Oct 20, 2016 at 11:29 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk 
> <mailto:j.hadfield at ed.ac.uk>> wrote:
>
>     Hi Jon,
>
>     MCMCglmm fits random-effect meta-analysis (I think this is what it
>     is caused) which assumes that even after correcting for sampling
>     errors there will be some 'real' between-observation variance (and
>     in your case covariance). I'm not sure what you are modelling, but
>     at least in the types of data I work with I can't really believe
>     there isn't some  'real' between-observation variance.
>
>     Cheers,
>
>     Jarrod
>
>
>
>
>     On 20/10/2016 19:09, Jon Bischof wrote:
>>     Jarrod,
>>
>>     Thanks for your detailed response! Your understanding of my model
>>     is correct: it's just a single grouping metaanalysis in two
>>     dimensions:
>>
>>     (y_i,1, y_i,2) ~ Normal ( (theta_i,1, theta_i,2), V_i )
>>     (theta_i,1, theta_i,2) ~ Normal ( (mu_1, mu_2), S )
>>
>>     where V_i is a known covariance matrix of measurement error.
>>
>>     As a new user of mcmcglmm I will need some time to experiment
>>     with your idea to confirm that it works. My understanding,
>>     however, was that residual error was specified in the R matrix
>>     and not the G matrix. Do we need to fix R as well? Will we still
>>     be able to estimate S?
>>
>>     Thanks!
>>     Jon
>>
>>     On Tue, Oct 18, 2016 at 11:45 PM, Jarrod Hadfield
>>     <j.hadfield at ed.ac.uk <mailto:j.hadfield at ed.ac.uk>> wrote:
>>
>>         Hi Jon,
>>
>>         If you have the covariance matrix for your observations, then
>>         take its inverse and store it in sparse format:
>>
>>         Cinv_sparse<-as(Cinv, "dgCMatrix")
>>
>>         where Cinv is the inverse in dense format. When you say
>>         multivariate do you mean something like an explicit bivariate
>>         response such that the fixed formula is of the form
>>         cbind(y_1, y_2)~...?  If so you need to organise your data in
>>         long format and pass a single response vector. You can
>>         include a variable that denotes whether the observation is
>>         y_1 or y_2 and use it like "trait", and include a variable
>>         that denotes the original row for the observation and use it
>>         like "units". If we call this second variable "row" then
>>         having fit "row" as a random effect, and pass the argument
>>         ginverse=list(row=Cinv_sparse) to MCMCglmm. You will also
>>         need to fix the "row" variance to one in the prior:
>>
>>         G1=list(V=1, fix=1)
>>
>>         Presumably covariances are only non-zero between observations
>>         from the same original row? If so make sure the sparse Matrix
>>         also represents this: numerical issues during inversion may
>>         cause zero entries to differ slightly from zero and hence not
>>         be represented as zero.
>>
>>         Cheers,
>>
>>         Jarrod
>>
>>
>>
>>
>>         Then you can fit the term ~trait:units
>>
>>
>>
>>
>>
>>         On 19/10/2016 05:41, Jon Bischof wrote:
>>
>>             I'm interested in fitting a multivariate meta-analysis
>>             model with
>>             correlated measurement error. This means fixing the error
>>             to a covariance
>>             matrix per row.
>>
>>             I saw this post
>>             <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q2/020180.html
>>             <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q2/020180.html>>
>>             on
>>             the mailing list about non-correlated outcomes, but the
>>             noise correlation
>>             is too large to ignore in my use case. Professor Hadfield
>>             implies in the
>>             post that it is possible but "complicated". Does anyone
>>             know how to do it?
>>
>>             Thanks!
>>             Jon Bischof
>>
>>                     [[alternative HTML version deleted]]
>>
>>             _______________________________________________
>>             R-sig-mixed-models at r-project.org
>>             <mailto:R-sig-mixed-models at r-project.org> mailing list
>>             https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>             <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>>
>>
>>
>>         -- 
>>         The University of Edinburgh is a charitable body, registered in
>>         Scotland, with registration number SC005336.
>>
>>
>
>
>     The University of Edinburgh is a charitable body, registered in
>     Scotland, with registration number SC005336.
>
>

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20161020/1671baca/attachment.pl>