[R-meta] Question regarding Generalized Linear Mixed-effects Model for Meta-analysis

Wed Jan 3 18:59:36 CET 2018

Hi James,

I tried to be clever and derived it myself. But now that I have had a bit more time to think about this, I don't think it is applicable for these purposes. The equation gives an estimate of the sampling variance of p if we were to repeatedly observe the performance of the same n individuals; that is, under repeated observations, their p_i values would differ, but the underlying true probabilities are assumed to stay the same. The more appropriate sampling variance is the one for repeated observations of n new individuals, whose true probabilities would change from sample to sample. That type of sampling variance is indeed just estimated by s^2 / n.
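
For anyone who wants to check this distinction numerically, here is a rough simulation sketch. The Beta(4, 2) distribution for the true probabilities and the values of n and t are assumptions chosen purely for illustration; nothing here comes from Aki's data.

```r
# Compare the two notions of "sampling variance of the mean proportion".
# Assumed setup (illustrative only): true probabilities ~ Beta(4, 2).
set.seed(1)
n_subj   <- 30    # n: subjects per study
n_trials <- 20    # t: trials per subject
reps     <- 5000  # number of simulated studies per scenario

sim_once <- function(pi_i) {
  p_i <- rbinom(n_subj, size = n_trials, prob = pi_i) / n_trials
  p   <- mean(p_i)
  s2  <- var(p_i)
  c(p       = p,
    v_fixed = (p * (1 - p) - s2) / (n_subj * n_trials),  # formula from the earlier mail
    v_new   = s2 / n_subj)                               # usual s^2 / n
}

# Scenario A: the same n individuals (fixed true probabilities) every time
pi_fixed <- rbeta(n_subj, 4, 2)
same <- replicate(reps, sim_once(pi_fixed))

# Scenario B: n new individuals (new true probabilities) every time
new <- replicate(reps, sim_once(rbeta(n_subj, 4, 2)))

res <- c(emp_same = var(same["p", ]), est_same = mean(same["v_fixed", ]),
         emp_new  = var(new["p", ]),  est_new  = mean(new["v_new", ]))
print(round(res, 5))
```

Under these assumptions, the average of v_fixed should come out close to the empirical variance of p in scenario A, while the average of s^2 / n should track the (larger) empirical variance in scenario B.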

So, Aki, please ignore my previous mail. Well, except that you can still analyze ln(p/(1-p)). And the sampling variance of ln(p/(1-p)) would then be estimated with v = 1/(p*(1-p))^2 * s^2 / n.
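
A minimal sketch of that computation, in case it helps. The helper name logit_es and the input values are made up for illustration; the variance is the delta-method expression given above.

```r
# Logit-transformed mean proportion and its delta-method sampling variance.
# logit_es() is a hypothetical helper; p, s, n below are invented values.
logit_es <- function(p, s, n) {
  yi <- log(p / (1 - p))               # ln(p/(1-p))
  vi <- 1 / (p * (1 - p))^2 * s^2 / n  # v = 1/(p*(1-p))^2 * s^2 / n
  data.frame(yi = yi, vi = vi)
}

dat <- logit_es(p = c(0.62, 0.80), s = c(0.18, 0.12), n = c(24, 40))
dat

# Back-transforming with the inverse logit always lands in the 0-1 range
plogis(dat$yi)
```

The resulting yi and vi columns could then be passed to a model on the logit scale, with estimates and interval bounds back-transformed via plogis().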

Best,
Wolfgang

-----Original Message-----
From: James Pustejovsky [mailto:jepusto at gmail.com] 
Sent: Wednesday, 03 January, 2018 15:12
To: Viechtbauer Wolfgang (SP)
Cc: Michael Dewey; Akifumi Yanagisawa; r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Question regarding Generalized Linear Mixed-effects Model for Meta-analysis

Wolfgang,

Please forgive me for following up with questions that are pure statistical geekery. Do you have a reference for the formula you gave on estimating the sampling variance of a mean proportion? I haven't seen it before and was curious to know its development. Also, is there a problem with simply using s^2 / n? This is the unbiased variance estimator under simple random sampling, and so I would have thought that it would work adequately here.

Best,
James

On Wed, Jan 3, 2018 at 4:18 AM, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
Two additions:

1) Estimation of the sampling variance of a mean proportion is a bit more complex.

Assume that in a given study there are n subjects, each of which completes t trials. So, for each subject, there is a proportion, p_i = x_i/t, where x_i denotes the number of 'successes' on the t trials. Let p = sum p_i / n denote the mean proportion and s^2 the variance of the proportions. Then the sampling variance of p can be estimated with:

v = (p*(1-p) - s^2) / (n*t).

So, when meta-analyzing values of p from multiple studies, the sampling variances should be computed in this way.

2) Instead of meta-analyzing values of p directly (which indeed might lead to predicted values outside of the 0-1 range), we can meta-analyze ln(p/(1-p)) values, which are unbounded, so that back-transformed values will always fall in the 0-1 range. The sampling variance of ln(p/(1-p)) can be estimated with:

v = 1/(p*(1-p))^2 * (p*(1-p) - s^2) / (n*t)

Best,
Wolfgang

-----Original Message-----
From: Michael Dewey [mailto:lists at dewey.myzen.co.uk]
Sent: Wednesday, 03 January, 2018 10:59
To: Akifumi Yanagisawa; Viechtbauer Wolfgang (SP)
Cc: r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Question regarding Generalized Linear Mixed-effects Model for Meta-analysis

Dear Aki

In that case why not just use the mean and its sampling variance in the
usual way? This may lead to impossible predictions as there will be no
way of specifying that the means are bounded above and below but it may
be the best you can do with what they have published.
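
To illustrate the point about impossible values, here is a small sketch with invented numbers (not taken from any study in this thread), using the mean proportion and s^2 / n as its sampling variance:

```r
# Hypothetical summary statistics for a single study (illustration only)
p <- 0.97  # mean proportion
s <- 0.10  # SD of the individual proportions
n <- 8     # number of participants

vi <- s^2 / n                                  # usual sampling variance of a mean
ci <- p + c(-1, 1) * qnorm(0.975) * sqrt(vi)   # Wald-type 95% interval
ci
```

With these made-up values the upper bound exceeds 1, which is the kind of out-of-range result one has to live with on the raw scale.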

Michael

On 02/01/2018 20:48, Akifumi Yanagisawa wrote:
> Thank you for your reply, Wolfgang.
>
> Your guess is right. I do not have a single count out of a total number of trials in each study. What I am using is the mean proportion and SD among the proportions.
>
> I am sad to hear that I cannot use the binomial distribution in glmer() in this case, and that the weights argument cannot be used for inverse-variance weights.
>
> Do you have any ideas on how to deal with this type of data?
>
> Thank you very much.
>
> Best regards,
>
> Aki
>
> On Jan 2, 2018, at 4:03 AM, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>
> Dear Aki,
>
> Before I could even suggest a modeling approach, I would need to better understand your dependent variable. You say that you have 'proportional data', but then also mention 'means' and 'SDs'. So, it seems to me that you do not have proportions per se (that is, you do not have a single count out of a total number of trials in each study -- which we could indeed model using a binomial GLMM with a logit link).
>
> Maybe you have studies where each participant conducted a number of trials, so that there is a proportion per participant and what is reported is the mean proportion and the SD among the proportions. But now I am just guessing.
>
> In either case, your glmer() syntax doesn't make sense. For a binomial GLMM, the 'weights' argument is used to give the number of trials when the response is the proportion of successes, but you are using 1/vi as weights.
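
A small sketch of what the 'weights' argument means for a binomial model. To stay dependency-free it uses glm() from base R rather than glmer(), which follows the same convention for a proportion response; the counts are invented for illustration.

```r
# Hypothetical counts: successes out of a known number of trials per unit
dat <- data.frame(successes = c(12, 15, 9, 18),
                  trials    = c(20, 20, 20, 20))
dat$prop <- dat$successes / dat$trials

# Response given as a proportion: 'weights' must be the number of trials
fit_w <- glm(prop ~ 1, weights = trials,
             family = binomial(link = "logit"), data = dat)

# Equivalent specification with successes and failures
fit_c <- glm(cbind(successes, trials - successes) ~ 1,
             family = binomial(link = "logit"), data = dat)

all.equal(coef(fit_w), coef(fit_c))  # same fit either way
plogis(coef(fit_w))                  # pooled proportion, 54/80
```

Passing 1/vi instead would be read as (non-integer) trial counts, not as inverse-variance weights.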
>
> Best,
> Wolfgang
>
> -----Original Message-----
> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Akifumi Yanagisawa
> Sent: Wednesday, 20 December, 2017 19:57
> To: r-sig-meta-analysis at r-project.org
> Subject: [R-meta] Question regarding Generalized Linear Mixed-effects Model for Meta-analysis
>
> Dear all,
>
> I am having some difficulty dealing with proportional data; the dependent variable is learning gain from an activity, for which means and SDs are converted into proportions. The learning gains are nested within articles; each article examined the learning gains from different types of activities and measured the learning gain at different time points (i.e., immediate posttest and delayed posttest). The main thing I would like to do is to get the estimated learning gain percentage and its confidence interval for each activity.
>
> Using the rma.mv() function, I noticed that the estimated values sometimes go over 100%, so I thought I should use a generalized linear mixed-effects model. On metafor’s webpage (http://www.metafor-project.org/doku.php/todo), I found that the rma.glmm() function does not support multilevel models so far and that using the lme4 package is suggested instead. I have been trying to figure out how to do this by myself, but I am not sure if I am doing it right. I would appreciate it if you could check whether my approach is appropriate and answer some of my questions.
>
> (1) The approach I tried was to (a) calculate the sampling variances from the means, SDs, and numbers of participants using the escalc function, and then (b) fit ‘results <- glmer (learning_gain ~ ACTIVITY * TEST_TIMING * TEST_TYPE + (1|article_number/participant_group) + (1|TEST_TIMING:participant_group), weights = 1/vi, family = binomial (link = logit))’. I use the sjPlot package for plotting and the emmeans package to get estimated learning gain percentages. Does this sound like the proper approach? Are there other options I should add?
>
> (2) Is it possible for me to get I^2 and H^2 values? I would like to know the proportion of variance explained by each moderator.
>
> (3) Is there any way I can conduct (a) a test for residual heterogeneity and (b) a test of moderators? If so, which R package would you recommend? I noticed that the anova function does not provide p-values for the test, and the lmerTest package does not work with the glmer function, either.
>
> Any suggestions and comments will be greatly appreciated. Thank you for your help.
>
> Aki