[R] Estimate average standard deviation of mean of two dependent groups
Joshua Wiley
jwiley.psych at gmail.com
Wed Aug 25 17:18:01 CEST 2010
Hi Jokel,
If I remember correctly:
1) The variance of the sum of two variables X and Y is:
var(X) + var(Y) + (2 * cov(X, Y))
If you create some sample data, you can verify this by:
var((X + Y))
2) The variance of a random variable (X) multiplied by some constant (K)
is equal to the variance of X multiplied by the square of K. So:
var(X * K) = var(X) * K^2
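As a rough illustration of both rules, here is a small check on made-up
data. The vectors x and y and the constant k are hypothetical values picked
only so the arithmetic is easy to follow by hand.

```r
# Hypothetical sample data, small enough to check by hand
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)
k <- 3

# Rule 1: variance of a sum
sum.direct <- var(x + y)
sum.formula <- var(x) + var(y) + 2 * cov(x, y)
all.equal(sum.direct, sum.formula)        # both are 20.5

# Rule 2: variance of a variable times a constant
scaled.direct <- var(x * k)
scaled.formula <- var(x) * k^2
all.equal(scaled.direct, scaled.formula)  # both are 90
```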
I have never seen these combined, but I imagine they could be.
Let Z = X + Y
The mean of X and Y is (X + Y)/2 = Z/2
From the formula above (1), the variance of Z is:
var(X) + var(Y) + (2 * cov(X, Y))
Because Z/2 = Z * 0.5, the variance of Z * 0.5 is given by:
var(Z) * (0.5^2)
and substituting, the variance of the mean of X and Y is:
(var(X) + var(Y) + (2 * cov(X, Y))) * (0.5^2)
This held up under several empirical tests I did where I had actual
data for X and Y.
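If you only have summary statistics, as in your case, the same result can
be written in terms of the standard deviations and the correlation, using
cov(A, B) = r * SD_A * SD_B from your message. Below is a minimal sketch;
the function name sd.of.mean and the example values 10 and 12 are my own
placeholders, not anything from your data. Only r = 0.5 comes from your
message.

```r
# Hypothetical helper: SD of the mean of two correlated variables,
# computed from their SDs and their correlation r.
sd.of.mean <- function(sd.a, sd.b, r) {
  # Variance of the sum, with cov(A, B) written as r * sd.a * sd.b
  var.sum <- sd.a^2 + sd.b^2 + 2 * r * sd.a * sd.b
  # Dividing the sum by 2 multiplies the variance by 0.5^2,
  # so the SD is multiplied by 0.5
  0.5 * sqrt(var.sum)
}

# Placeholder SDs of 10 and 12, with the assumed correlation r = 0.5
sd.of.mean(10, 12, 0.5)
```

In other words, the SD of the average is half the SD of the sum, so your
formula for the sum carries over once it is multiplied by 0.5.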
***Disclaimer***
I am piecing this together, and I am far from an expert on the
subject. I do not know if it is actually true, and there may be
additional assumptions.
I tested this using:
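myfun <- function() {
  # Two independent samples of 100 standard normal draws
  x <- rnorm(100)
  y <- rnorm(100)
  var.x <- var(x)
  var.y <- var(y)
  cov.xy <- cov(x, y)
  # Formula (1): variance of the sum
  calc.var.xplusy <- var.x + var.y + 2 * cov.xy
  # Formula (2) with K = 0.5: variance of the mean
  var.meanxy <- calc.var.xplusy * (0.5^2)
  # Variance of the mean computed directly from the data
  empirical.var.meanxy <- var((x + y) / 2)
  # all.equal() returns a character vector on mismatch, so wrap it in isTRUE()
  isTRUE(all.equal(var.meanxy, empirical.var.meanxy))
}
# Repeat the check 1000 times; the result is TRUE only if every run agreed
temp <- vector("logical", 1000)
for (i in 1:1000) {
  temp[i] <- myfun()
}
all(temp)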
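The test above draws x and y independently, so here is a sketch of the same
kind of check on data that are correlated by construction, written in terms
of SDs and cor() rather than variances and cov(). The seed, the sample size,
and the way y is built from x are assumptions made purely for illustration.

```r
set.seed(42)
n <- 100
x <- rnorm(n)
# Hypothetical construction: y has a population correlation of 0.5 with x
y <- 0.5 * x + sqrt(1 - 0.5^2) * rnorm(n)

# SD of the mean from the SDs and the sample correlation
sd.formula <- 0.5 * sqrt(sd(x)^2 + sd(y)^2 + 2 * cor(x, y) * sd(x) * sd(y))
# SD of the mean computed directly from the data
sd.direct <- sd((x + y) / 2)

isTRUE(all.equal(sd.formula, sd.direct))
```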
HTH,
Josh
On Wed, Aug 25, 2010 at 5:29 AM, Jokel Meyer <jokel.meyer at googlemail.com> wrote:
> Dear R-experts!
>
> I am currently running a meta-analysis with the help of the great metafor
> package. However I have some difficulties setting up my raw data to enter it
> to the meta-analysis models.
> I have a group of subjects that have been measured in two continuous
> variables (A & B). I have the information about the mean of the two
> variables, the group size (n) and the standard deviations of the two
> variables.
> Now I would like to average both variables (A & B) and get the mean and
> standard deviation of the merged variable (C).
> As for the mean this would be quite easy: I would just take the mean of mean
> A and mean B to get the mean of C.
> However for the standard deviation this seems more tricky, as it is reasonable
> to assume that A & B are correlated. I assume (based on further
> analysis) a correlation of r = 0.5.
> I found the formula to get the standard deviation of the SUM (not the mean)
> of two variables:
> SD=SQRT(SD_A^2 + SD_B^2 + 2*r*SD_A*SD_B)
>
> with SD_A and SD_B being the standard deviations of A and B, and r*SD_A*SD_B
> being the covariance of A and B.
>
> Would this formula also be valid if I want to average (and not sum) my two
> variables?
>
> Many thanks for any help & best wishes,
> Jokel
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/