# [R-meta] Collapsing a between subject factor

Oliver Clark oliver.clark3 at stu.mmu.ac.uk
Mon Jan 29 12:14:04 CET 2018

```
Dear Michael,

Many thanks for your response.  Indeed, the sample sizes are unequal which is apparently why it was treated as two analyses.

I’ve been playing with this example and others, and your formula below overestimates the variance.  I think this is because the raw means are being squared rather than their deviations from the combined mean:

S_c <- (n_1 * (v_1 + (m_1 - m_c) ^2 ) + n_2*( v_2 + ( m_2 - m_c) ^2) ) / ( n_1 + n_2)

This still overestimates the known population variance of 4, so applying Bessel's correction:

S_c_2 <- ( (n_1 - 1 )*( v_1 +( m_1 - m_c ) ^2 ) + ( n_2 - 1)*( v_2 + ( m_2 - m_c)^2) ) / ( ( n_1 + n_2) -1 )

leads to a good estimate of the combined variance.  Code:

> M <- rnorm(34,5,3)
> F <- rnorm(57,5,3)
>
> comb <- c(M,F)
> n_1 = 34
> n_2 = 57
> m_1 = mean(M)
> m_2 = mean(F)
>
> v_1 = sd(M)^2
> v_2 = sd(F)^2
>
> m_c = (n_1 * m_1 + n_2 * m_2) / (n_1 + n_2)
>
> S_c_2 <- ( (n_1 - 1 )*( v_1 +( m_1 - m_c ) ^2 ) + ( n_2 - 1)*( v_2 + ( m_2 - m_c)^2) ) / ( ( n_1 + n_2) -1 )
>
> sd(comb) - sqrt(S_c_2)
[1] 0.001710072

Best wishes,

Oliver

> On 29 Jan 2018, at 10:02, Michael Dewey <lists at dewey.myzen.co.uk> wrote:
>
> Dear Oliver
>
> You do not say whether the sample sizes are equal or not so I give the procedure for unequal.
>
> For the means you need to weight by sample size
>
> (n_1 * m_1 + n_2 * m_2) / (n_1 + n_2)
>
> where n are sample sizes and m means
>
> For variance you need
>
> (n_1 * (m_1^2 + v_1) + n_2 * (m_2^2 + v_2) / (n_1 + n_2)) - m_c
>
> where v are variances and m_c is the combined mean you got above.
>
> I suggest double checking this with a few examples in case of transcription errors at my end or yours.
>
> Michael
>
>
> On 28/01/2018 21:49, Oliver Clark wrote:
>> Hi all,
>> I am currently coding studies for a meta-analysis and have come across a case in which I have a set of studies in which all but one do not include sex as a between subject factor.  The reason given was unequal cell sizes, differences in visual stimuli (it is not clear what these differences are so they are unlikely to be systematic, rather an artefact)  and strength differences between men and women.
>> With my limited experience, I don’t see the benefit in treating these both as separate cases and was wondering whether it would make sense to merge the means and SDs for both groups and use that with the total N to calculate an effect size?
>> Combining the means seems relatively straightforward but I am not sure how to do the standard deviations.  I have tried averaging the variance in the following simulation to get there but must admit that I am stabbing in the dark!:
>>> M <- rnorm(10,5,2)
>>> F <- rnorm(10,5,2)
>>>
>>> comb <- c(M,F)
>>>
>>> (mean(M) + mean(F)) / 2 == mean(comb)
>> [1] TRUE
>>>
>>> sqrt((sd(M)^2 + sd(F)^2)/2) == sd(comb)
>> [1] FALSE
>> Can anyone offer any advice on the best path for this? Should I treat them as different studies, attempt to merge the means and SDs, use a different aggregation method or omit this study?
>> Many thanks,
>> Oliver Clark
>> PhD Student
>> Manchester Metropolitan University
>>  _______________________________________________
>> R-sig-meta-analysis mailing list
>> R-sig-meta-analysis at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html

```
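The combination step discussed in the thread can also be written so that it reproduces `sd(comb)` up to floating-point error rather than approximately. The following is a minimal sketch in R, assuming only the group sizes, means and sample variances (the n - 1 kind returned by `var()` or `sd()^2`) are available. The function name `combine_groups` and the variables `male` and `female` are illustrative and do not come from the thread. It differs from `S_c_2` above in one respect: the within-group sums of squares are weighted by n - 1, while the squared deviations of the group means from the combined mean are weighted by n.

```r
# Combine two groups' summary statistics into one group's n, mean and SD.
# v_1 and v_2 are sample variances (denominator n - 1).
combine_groups <- function(n_1, m_1, v_1, n_2, m_2, v_2) {
  n_c <- n_1 + n_2
  # Sample-size-weighted combined mean
  m_c <- (n_1 * m_1 + n_2 * m_2) / n_c
  # Sum of squares within each group, recovered from the sample variances
  ss_within <- (n_1 - 1) * v_1 + (n_2 - 1) * v_2
  # Sum of squares due to each group mean differing from the combined mean
  ss_between <- n_1 * (m_1 - m_c)^2 + n_2 * (m_2 - m_c)^2
  # Sample variance of the pooled data (denominator n_c - 1)
  v_c <- (ss_within + ss_between) / (n_c - 1)
  list(n = n_c, mean = m_c, var = v_c, sd = sqrt(v_c))
}

# Check against the raw data, using the same group sizes as the simulation above
set.seed(1)
male   <- rnorm(34, 5, 3)
female <- rnorm(57, 5, 3)
res <- combine_groups(length(male), mean(male), var(male),
                      length(female), mean(female), var(female))
all.equal(res$sd, sd(c(male, female)))
```

In a meta-analysis the raw data are usually unavailable, so the function would be called with the reported n, mean and SD^2 for each subgroup. This assumes the reported SDs were computed with the n - 1 denominator.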