[R] Estimate average standard deviation of mean of two dependent groups

Joshua Wiley jwiley.psych at gmail.com
Wed Aug 25 17:18:01 CEST 2010


Hi Jokel,

If I remember correctly:

1) The variance of the sum of two variables X and Y is:
   var(X) + var(Y) + (2 * cov(X, Y))

   If you create some sample data, you can verify this by comparing
   the result with:
   var(X + Y)

2) The variance of a random variable (X) multiplied by some constant (K)
    is equal to the variance of X multiplied by the square of K.  So:

    var(X * K) = var(X) * K^2
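
Both identities can be checked on simulated data along these lines. This
is only a sketch: the seed, the constant k, and the way y is made to
correlate with x are arbitrary choices for illustration.

```r
set.seed(42)                 # arbitrary seed, only for reproducibility
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)     # y is deliberately correlated with x
k <- 3                       # arbitrary constant

# (1) variance of a sum
lhs.sum <- var(x + y)
rhs.sum <- var(x) + var(y) + 2 * cov(x, y)
all.equal(lhs.sum, rhs.sum)

# (2) variance of a variable multiplied by a constant
lhs.scale <- var(x * k)
rhs.scale <- var(x) * k^2
all.equal(lhs.scale, rhs.scale)
```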

I have never seen these combined, but I imagine they could be.
Let Z = X + Y
The mean of X and Y is (X + Y)/2 = Z/2
From formula (1) above, the variance of Z is:

var(X) + var(Y) + (2 * cov(X, Y))

Because Z/2 = Z * 0.5, the variance of Z * 0.5 is given by:

var(Z) * (0.5^2)

and substituting, the variance of the mean of X and Y is:

(var(X) + var(Y) + (2 * cov(X, Y))) * (0.5^2)
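
When only summary statistics are available, as in the question quoted
below, the covariance can be written as r * SD_A * SD_B and the same
result expressed in standard deviations. A rough sketch follows; the
name sd.mean.two and the example values are made up for illustration,
and r has to be supplied by the user.

```r
# SD of the average (A + B)/2 from the two SDs and their correlation r.
# sd.mean.two is a hypothetical helper name, not a function from any package.
sd.mean.two <- function(sd.a, sd.b, r) {
  var.sum <- sd.a^2 + sd.b^2 + 2 * r * sd.a * sd.b  # variance of A + B
  sqrt(var.sum * 0.5^2)                             # scale by 0.5^2, take the root
}

# made-up example values: SD_A = 2, SD_B = 3, r = 0.5
sd.mean.two(2, 3, 0.5)   # equals sqrt(4.75)
```

Put differently, the SD of the mean is half the SD of the sum.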

This held up under several empirical tests I did where I had actual
data for X and Y.

***Disclaimer***
I am piecing this together, and I am far from an expert on the
subject.  I do not know if it is actually true, and there may be
additional assumptions.

I tested this using:

myfun <- function() {
  x <- rnorm(100)
  y <- rnorm(100)
  var.x <- var(x)
  var.y <- var(y)
  cov.xy <- cov(x, y)
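  # variance of the sum, from formula (1)
  calc.var.xplusy <- var.x + var.y + 2*cov.xy
  # variance of the mean: scale by 0.5^2, from formula (2)
  var.meanxy <- calc.var.xplusy * (.5^2)
  # variance computed directly from the averaged data
  empirical.var.meanxy <- var( (x + y)/2 )
  # all.equal() returns a character string on mismatch, so wrap it in isTRUE()
  output <- isTRUE(all.equal(var.meanxy, empirical.var.meanxy))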
  return(output)
}

temp <- vector("logical", 1000)
for(i in 1:1000) {temp[i] <- myfun()}
all(temp)
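
The simulation above draws x and y independently. A variant with
correlated data, closer to the r = 0.5 situation in the question, might
look like the following. The name myfun.cor, the seed, and the way y is
built from x are assumptions made for this sketch.

```r
myfun.cor <- function(r = 0.5, n = 100) {
  x <- rnorm(n)
  # y has population correlation r with x (both have unit variance)
  y <- r * x + sqrt(1 - r^2) * rnorm(n)
  var.meanxy <- (var(x) + var(y) + 2 * cov(x, y)) * 0.5^2
  isTRUE(all.equal(var.meanxy, var((x + y) / 2)))
}

set.seed(1)  # arbitrary seed, only for reproducibility
temp.cor <- replicate(1000, myfun.cor())
all(temp.cor)
```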

HTH,

Josh

On Wed, Aug 25, 2010 at 5:29 AM, Jokel Meyer <jokel.meyer at googlemail.com> wrote:
> Dear R-experts!
>
> I am currently running a meta-analysis with the help of the great metafor
> package. However I have some difficulties setting up my raw data to enter it
> to the meta-analysis models.
> I have a group of subjects that have been measured in two continuous
> variables (A & B). I have the information about the mean of the two
> variables, the group size (n) and the standard deviations of the two
> variables.
> Now I would like to average both variables (A & B) and get the mean and
> standard deviation of the merged variable (C).
> As for the mean this would be quite easy: I would just take the mean of mean
> A and mean B to get the mean of C.
> However, for the standard deviation this seems trickier, as it must be assumed
> that A & B are correlated. I assume (based on further
> analysis) a correlation of r = 0.5.
> I found the formula to get the standard deviation of the SUM (not the mean)
> of two variables:
> SD=SQRT(SD_A^2 + SD_B^2 + 2*r*SD_A*SD_B)
>
> with SD_A and SD_B being the standard deviation of A and B. And r*SD_A*SD_B
> being the covariance of A and B.
>
> Would this formula also be valid if I want to average (and not sum) my two
> variables?
>
> Many thanks for any help & best wishes,
> Jokel



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


