[R-sig-ME] MCMCglmm multivariate meta-analysis with covariance

Jon Bischof jbischof.stat at gmail.com
Thu Oct 20 23:51:29 CEST 2016


Jarrod,

First, I want to thank you for your detailed responses---this is extremely
generous and helpful!

I'll get back to you after trying these new model specifications out. The
covariances not due to sampling error (which I call S in my earlier email)
are exactly what I'm trying to estimate! If they are non-zero it means we
have a heterogenous treatment effect across advertisers. The overall mean
mu is important as well, but the posterior of S is the quantity most
difficult to estimate with simple models.

Jon

On Thu, Oct 20, 2016 at 1:04 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
wrote:

> Hi Jon,
>
> In this instance you are actually fitting 3 group-level random effects; 1
> in the random part of the model, one in residual part of the model and one
> in the argument to mev.
>
> The random effects and the residual effects are confounded and are not
> identifiable in the likelihood. You would be better just fitting the model:
>
> m <- MCMCglmm::MCMCglmm(outcome ~ 1, rcov=~group, mev=dat$obsVar,data=dat,
> prior=list(R=list(V=1, nu=0)))
>
> it is virtually the same as your original model (since the residual
> variance was set close to zero)  but it will iterate faster and mix better.
> In this model you estimate the between-observation (in your case
> between-group) variance after accounting for the sampling variance. There
> are two additional, equivalent, ways of writing this model:
>
> m.a <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~us(sqrt(obsVar)):group,
> rcov=~group,data=dat, prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1,
> fix=1))))
> and
>
> Cinv_sparse<-as(diag(1/dat$obsVar), "dgCMatrix")
>
> rownames(Cinv_sparse)<-dat$group
>
> m.b <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, rcov=~group,data=dat,
> ginverse=list(group=Cinv_sparse), prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1,
> fix=1))))
>
> It is this latter equivalence that I suggested you exploit in the
> bivariate case. Lets say there are now two rows per group (one for spend
> and one for performance) and you have a column in the data-frame indicating
> response type (lets call it sORp)
>
> I don't know how your 2x2 sampling covariance matrices for each group are
> stored, but lets say they are in list form. As an example, with two groups:
>
> C<-list(S1=matrix(c(1,0.5, 0.5,1),2,2), S2=matrix(c(2,0, 0,1),2,2))
>
> Cinv_sparse<-as(list2bdiag(lapply(C,solve)), "dgCMatrix")
>
> Assuming the data are sorted response type with group (i.e. in the same
> order as C) then you can fit
>
> dat$observation<-1:nrow(dat)
>
> rownames(Cinv_sparse)<-1:nrow(dat)
>
> m.biv <- MCMCglmm::MCMCglmm(outcome ~ sORp-1, random=~observation,
> rcov=~us(sORp):group,data=dat, ginverse=list(observation=Cinv_sparse),
> prior=list(R=list(V=diag(2), nu=0), G=list(G1=list(V=1, fix=1))))
>
> The rcov term models between-observation (ie. between group) (co)variation
> not due to sampling error. These variances could be zero, but you would
> have to have a pretty effective way of randomising the original
> (unsummarised) observations between groups.
>
> Cheers,
>
> Jarrod
>
>
>
>
> On 20/10/2016 20:31, Jon Bischof wrote:
>
> Hi Jarrod,
>
> You are making an excellent point, but I am working in a big data setting
> where only group aggregates, not individual outcomes, will fit in memory.
>
> In brief: this model estimates the log ratio of advertiser spend and
> performance between a control and experiment condition. These two outcomes
> are measured at a click level, but we have far too many clicks to store
> individually in memory. Instead we want to fit a model to advertiser
> sufficient statistics, which are the log ratio mean and within variance.
>
> For either of these outcomes individually, I was able to fit the following
> model and recover the true hyperparameters from fake data:
>
> # Prior distribution for model
>> kMCMCglmmPrior <- list(R=list(fix=1, V=1e-6), G=list(G1=list(V=1e-1,
>> nu=-2)))
>> # dat: A data.frame with columns 'group', 'outcome', and 'obsVar'
>> m <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, mev=dat$obsVar,
>>                           data=dat, prior=kMCMCglmmPrior)
>
>
> However, for many GLMM implementations (including LMER), the within and
> between variance are not separately identifiable when aggregate data are
> provided. This is not true statistically, but the aggregate likelihood
> requires special handling and cannot be fit from a program that expects one
> row for each y_ij.
>
> Can MCMCglmm fit my one-dimensional model when R is not fixed and there is
> only one row per group?
>
> Thanks!
> Jon
>
>
>
> On Thu, Oct 20, 2016 at 11:29 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
> wrote:
>
>> Hi Jon,
>>
>> MCMCglmm fits random-effect meta-analysis (I think this is what it is
>> caused) which assumes that even after correcting for sampling errors there
>> will be some 'real' between-observation variance (and in your case
>> covariance). I'm not sure what you are modelling, but at least in the types
>> of data I work with I can't really believe there isn't some  'real'
>> between-observation variance.
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>> On 20/10/2016 19:09, Jon Bischof wrote:
>>
>> Jarrod,
>>
>> Thanks for your detailed response! Your understanding of my model is
>> correct: it's just a single grouping metaanalysis in two dimensions:
>>
>> (y_i,1, y_i,2) ~ Normal ( (theta_i,1, theta_i,2), V_i )
>> (theta_i,1, theta_i,2) ~ Normal ( (mu_1, mu_2), S )
>>
>> where V_i is a known covariance matrix of measurement error.
>>
>> As a new user of mcmcglmm I will need some time to experiment with your
>> idea to confirm that it works. My understanding, however, was that residual
>> error was specified in the R matrix and not the G matrix. Do we need to fix
>> R as well? Will we still be able to estimate S?
>>
>> Thanks!
>> Jon
>>
>> On Tue, Oct 18, 2016 at 11:45 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
>> wrote:
>>
>>> Hi Jon,
>>>
>>> If you have the covariance matrix for your observations, then take its
>>> inverse and store it in sparse format:
>>>
>>> Cinv_sparse<-as(Cinv, "dgCMatrix")
>>>
>>> where Cinv is the inverse in dense format. When you say multivariate do
>>> you mean something like an explicit bivariate response such that the fixed
>>> formula is of the form cbind(y_1, y_2)~...?  If so you need to organise
>>> your data in long format and pass a single response vector. You can include
>>> a variable that denotes whether the observation is y_1 or y_2 and use it
>>> like "trait", and include a variable that denotes the original row for the
>>> observation and use it like "units". If we call this second variable "row"
>>> then having fit "row" as a random effect, and pass the argument
>>> ginverse=list(row=Cinv_sparse) to MCMCglmm. You will also need to fix the
>>> "row" variance to one in the prior:
>>>
>>> G1=list(V=1, fix=1)
>>>
>>> Presumably covariances are only non-zero between observations from the
>>> same original row? If so make sure the sparse Matrix also represents this:
>>> numerical issues during inversion may cause zero entries to differ slightly
>>> from zero and hence not be represented as zero.
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>>
>>>
>>>
>>> Then you can fit the term ~trait:units
>>>
>>>
>>>
>>>
>>>
>>> On 19/10/2016 05:41, Jon Bischof wrote:
>>>
>>>> I'm interested in fitting a multivariate meta-analysis model with
>>>> correlated measurement error. This means fixing the error to a
>>>> covariance
>>>> matrix per row.
>>>>
>>>> I saw this post
>>>> <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q2/020180.html>
>>>> on
>>>> the mailing list about non-correlated outcomes, but the noise
>>>> correlation
>>>> is too large to ignore in my use case. Professor Hadfield implies in the
>>>> post that it is possible but "complicated". Does anyone know how to do
>>>> it?
>>>>
>>>> Thanks!
>>>> Jon Bischof
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>
>>
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list