[R-meta] Differences in calculation of CVR in escalc()

Thu Oct 12 09:55:13 CEST 2017

I've added a note to help(escalc) about this.

A reference showing that the sample mean and variance are independent for normally distributed data? You can find some references here:

https://en.wikipedia.org/wiki/Normal_distribution#Properties
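
The point can also be checked with a quick simulation. The following is only a sketch under assumptions of my own choosing, not something from the escalc code: the sample size (10), the number of replications (5000), the normal parameters (mean 10, SD 2), the exponential distribution used as a skewed contrast, and the helper names pearson() and corr_log_mean_log_sd() are all just illustrative. It draws repeated samples, computes ln(mean) and ln(SD) for each, and correlates the two across replications.

```python
import math
import random
import statistics


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corr_log_mean_log_sd(draw, n=10, reps=5000):
    """Correlation of ln(mean) and ln(SD) across repeated samples of size n."""
    log_means, log_sds = [], []
    for _ in range(reps):
        sample = [draw() for _ in range(n)]
        log_means.append(math.log(statistics.fmean(sample)))
        log_sds.append(math.log(statistics.stdev(sample)))
    return pearson(log_means, log_sds)


rng = random.Random(2017)  # arbitrary seed, only for reproducibility

# Normal data (mean 10 and SD 2 are arbitrary illustrative values).
r_normal = corr_log_mean_log_sd(lambda: rng.gauss(10, 2))

# Skewed data (exponential with rate 1), as a non-normal contrast.
r_skewed = corr_log_mean_log_sd(lambda: rng.expovariate(1.0))

print(f"normal:      r = {r_normal:.3f}")
print(f"exponential: r = {r_skewed:.3f}")
```

For the normal case the correlation should be close to zero (up to simulation error), while for the skewed case it should be clearly positive. Note that this is the correlation within the sampling distribution for a single study, not the correlation of means and SDs across studies.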

Best,
Wolfgang

-----Original Message-----
From: Samuel Knapp [mailto:samuel.knapp at tum.de] 
Sent: Thursday, 12 October, 2017 9:29
To: Viechtbauer Wolfgang (SP); r-sig-meta-analysis at r-project.org
Cc: Marcel van der Heijden
Subject: Re: [R-meta] Differences in calculation of CVR in escalc()

Hi Wolfgang,

many thanks for your explanation. I think I understand your point now.

However, it would be nice if this difference in calculation were made 
clear in the help for escalc. So far, only Nakagawa et al. (2015) is 
cited (even 3 times), which led me to assume that the calculations are 
done as shown in that reference.

Also, is there a reference that supports your argument?

Best,

Samuel

On 11/10/17 21:23, Viechtbauer Wolfgang (SP) wrote:
> Hi Samuel,
>
> Taylor's law describes an *empirical* phenomenon: means and variances (or transformations thereof) across studies tend to be associated in certain ways. That is, if the true mean is larger, then the true variance also tends to be larger.
>
> That is again something entirely different from the correlation between the sample mean and sample variance (or transformations thereof) in the bivariate sampling distribution for data from some distribution. We can derive this correlation (so this is not an empirical phenomenon but a purely statistical fact), and for normally distributed data that correlation is zero.
>
> Across many studies, the sample mean and variance could indeed be correlated even for normally distributed data because of Taylor's law. But then this has nothing to do with the sampling variance. We would capture this correlation by modeling the relationship between the underlying true means and variances in some way.
>
> Best,
> Wolfgang
>
> -----Original Message-----
> From: Samuel Knapp [mailto:samuel.knapp at tum.de]
> Sent: Wednesday, 11 October, 2017 0:00
> To: Viechtbauer Wolfgang (SP); Michael Dewey; r-sig-meta-analysis at r-project.org
> Subject: Re: [R-meta] Differences in calculation of CVR in escalc()
>
> Dear Wolfgang,
>
> that sounds like a tricky problem. I agree with you that the best (or the worst) assumption we can make about the distribution is normality. However, in observed data the mean and SD very often covary (e.g., Döring et al. 2015), which is also the motivation for using CV = SD/mean.
>
> Nakagawa et al. (2015) assume in their appendix that the means and SDs are normally distributed, but they also assume covariation of mean and SD (without giving references, but I guess because of the observation mentioned above).
>
> I understand your point about the difference between the correlation across studies and the correlation in the bivariate sampling distribution. However, would it not be better to still include the correlation term, in order to account for the often observed covariation of mean and SD (while remaining a good approximation regardless of the true distribution)? One could also argue that if there is no correlation (because of an assumed normal distribution), it will be estimated as zero and thus have no effect.
>
> Looking forward to your reply!
>
> Best regards,
> Samuel
>
> References: Döring, T.F., Knapp, S., Cohen, J.E., 2015. Taylor’s power law and the stability of crop yields. Field Crops Research 183, 294–302. doi:10.1016/j.fcr.2015.08.005
>
> On 10/10/17 11:38, Viechtbauer Wolfgang (SP) wrote:
> Dear Samuel,
>
> Eq. 12 in Nakagawa et al. (2015) is not correct. For normally distributed data, the sample mean and variance are independent, and so are ln(mean) and ln(SD). Hence, the correlation term should be omitted.
>
> For data that cannot be assumed to be (approximately) normally distributed, the mean and variance are no longer independent and then one would need to account for their correlation in the computation of the sampling variance. However, the correlation between the means and variances (or ln(mean) and ln(SD) values) across studies is not the same thing as the correlation within the bivariate sampling distribution. In fact, those are really different things.
>
> One would have to derive what the correlation is depending on the type of distribution one wants to assume for the data. The correct equation would differ for each type of distribution.
>
> Best,
> Wolfgang
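
As a sketch of the argument in the oldest message above (a standard delta-method approximation, not copied from the thread or from the escalc documentation): for a single group with sample mean \bar{x}, sample SD s, and sample size n, write the log coefficient of variation as ln(s) - ln(\bar{x}). The symbol \rho below denotes the correlation between ln(\bar{x}) and ln(s) within the sampling distribution; the notation is an assumption made here for illustration.

```latex
\begin{align*}
\operatorname{Var}[\ln \mathrm{CV}]
  &= \operatorname{Var}[\ln s] + \operatorname{Var}[\ln \bar{x}]
     - 2\rho \sqrt{\operatorname{Var}[\ln s]\,\operatorname{Var}[\ln \bar{x}]}, \\
\operatorname{Var}[\ln \bar{x}] &\approx \frac{s^2}{n\,\bar{x}^2}, \qquad
\operatorname{Var}[\ln s] \approx \frac{1}{2(n-1)}, \\
\rho = 0 \;\Rightarrow\;
\operatorname{Var}[\ln \mathrm{CV}]
  &\approx \frac{s^2}{n\,\bar{x}^2} + \frac{1}{2(n-1)}.
\end{align*}
```

With two independent groups, the sampling variance of the log CV ratio would then be the sum of this expression over both groups. For non-normal data, \rho is generally nonzero and depends on the assumed distribution, so it would have to be derived case by case.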