[R-sig-ME] MCMCglmm multivariate meta-analysis with covariance

Thu Oct 20 21:31:04 CEST 2016

Hi Jarrod,

You are making an excellent point, but I am working in a big data setting
where only group aggregates, not individual outcomes, will fit in memory.

In brief: this model estimates the log ratio of advertiser spend and
performance between a control and experiment condition. These two outcomes
are measured at a click level, but we have far too many clicks to store
individually in memory. Instead we want to fit a model to advertiser
sufficient statistics, which are the log ratio mean and within variance.

For either of these outcomes individually, I was able to fit the following
model and recover the true hyperparameters from fake data:

# Prior distribution for model
> kMCMCglmmPrior <- list(R=list(fix=1, V=1e-6), G=list(G1=list(V=1e-1,
> nu=-2)))
> # dat: A data.frame with columns 'group', 'outcome', and 'obsVar'
> m <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, mev=dat$obsVar,
>                           data=dat, prior=kMCMCglmmPrior)

However, for many GLMM implementations (including LMER), the within and
between variance are not separately identifiable when aggregate data are
provided. This is not true statistically, but the aggregate likelihood
requires special handling and cannot be fit from a program that expects one
row for each y_ij.

Can MCMCglmm fit my one-dimensional model when R is not fixed and there is
only one row per group?

Thanks!
Jon

On Thu, Oct 20, 2016 at 11:29 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
wrote:

> Hi Jon,
>
> MCMCglmm fits random-effect meta-analysis (I think this is what it is
> caused) which assumes that even after correcting for sampling errors there
> will be some 'real' between-observation variance (and in your case
> covariance). I'm not sure what you are modelling, but at least in the types
> of data I work with I can't really believe there isn't some  'real'
> between-observation variance.
>
> Cheers,
>
> Jarrod
>
>
>
>
> On 20/10/2016 19:09, Jon Bischof wrote:
>
> Jarrod,
>
> Thanks for your detailed response! Your understanding of my model is
> correct: it's just a single grouping metaanalysis in two dimensions:
>
> (y_i,1, y_i,2) ~ Normal ( (theta_i,1, theta_i,2), V_i )
> (theta_i,1, theta_i,2) ~ Normal ( (mu_1, mu_2), S )
>
> where V_i is a known covariance matrix of measurement error.
>
> As a new user of mcmcglmm I will need some time to experiment with your
> idea to confirm that it works. My understanding, however, was that residual
> error was specified in the R matrix and not the G matrix. Do we need to fix
> R as well? Will we still be able to estimate S?
>
> Thanks!
> Jon
>
> On Tue, Oct 18, 2016 at 11:45 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
> wrote:
>
>> Hi Jon,
>>
>> If you have the covariance matrix for your observations, then take its
>> inverse and store it in sparse format:
>>
>> Cinv_sparse<-as(Cinv, "dgCMatrix")
>>
>> where Cinv is the inverse in dense format. When you say multivariate do
>> you mean something like an explicit bivariate response such that the fixed
>> formula is of the form cbind(y_1, y_2)~...?  If so you need to organise
>> your data in long format and pass a single response vector. You can include
>> a variable that denotes whether the observation is y_1 or y_2 and use it
>> like "trait", and include a variable that denotes the original row for the
>> observation and use it like "units". If we call this second variable "row"
>> then having fit "row" as a random effect, and pass the argument
>> ginverse=list(row=Cinv_sparse) to MCMCglmm. You will also need to fix the
>> "row" variance to one in the prior:
>>
>> G1=list(V=1, fix=1)
>>
>> Presumably covariances are only non-zero between observations from the
>> same original row? If so make sure the sparse Matrix also represents this:
>> numerical issues during inversion may cause zero entries to differ slightly
>> from zero and hence not be represented as zero.
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>> Then you can fit the term ~trait:units
>>
>>
>>
>>
>>
>> On 19/10/2016 05:41, Jon Bischof wrote:
>>
>>> I'm interested in fitting a multivariate meta-analysis model with
>>> correlated measurement error. This means fixing the error to a covariance
>>> matrix per row.
>>>
>>> I saw this post
>>> <https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q2/020180.html>
>>> on
>>> the mailing list about non-correlated outcomes, but the noise correlation
>>> is too large to ignore in my use case. Professor Hadfield implies in the
>>> post that it is possible but "complicated". Does anyone know how to do
>>> it?
>>>
>>> Thanks!
>>> Jon Bischof
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>

	[[alternative HTML version deleted]]