[R-meta] variance explained by fixed & random effects
Theresa Stratmann
theresa.stratmann sending from senckenberg.de
Thu Mar 14 17:13:30 CET 2019
Dear Wolfgang,
Thank you for the fast reply!
So the code is easy enough, but I am struggling to understand some of the nuance / how to properly think about the different numbers. I have added some questions, between the dashed lines, below your answers. If they are within the scope of the mailing list and not too cumbersome, I would be curious to hear some explanations to help clear up my confusion.
Thanks!
Theresa
One way of estimating an R^2-type measure for the fixed effects is to fit the reduced model without the predictor and then compute the proportional reduction in the variance components:
res0 <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)
pmax(0, (res0$sigma2 - res$sigma2) / res0$sigma2) * 100
max(0, (sum(res0$sigma2) - sum(res$sigma2)) / sum(res0$sigma2)) * 100
One can compute this per component or overall. There is no guarantee that the value is >= 0, so I use pmax()/max() to set negative values to 0. One could debate whether it even makes sense here to estimate how much of the school-level heterogeneity is accounted for by 'year'(since this is a district-level predictor), but that's another issue.
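For reference, here is a minimal self-contained version of the above (just the model calls from the message quoted below combined with these computations, using the dat.konstantopoulos2011 data that ships with metafor):

library(metafor)

dat <- dat.konstantopoulos2011

# model with the district-level moderator 'year'
res <- rma.mv(yi, vi, mods = ~ factor(year), random = ~ 1 | district/school, data=dat)

# reduced model without the moderator
res0 <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)

# proportional reduction in each variance component (district, school), in percent
pmax(0, (res0$sigma2 - res$sigma2) / res0$sigma2) * 100

# proportional reduction in the total (summed) heterogeneity, in percent
max(0, (sum(res0$sigma2) - sum(res$sigma2)) / sum(res0$sigma2)) * 100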
-----------------------------------------------------------------------------------------------------------------------------------------
***
So we have:
sigma^2 for res0 .... heterogeneity in district & school:
0.06506194 0.03273652
sigma^2 for res .... heterogeneity in district & school, accounting for year:
0.11482670 0.03208976
(res0$sigma2 - res$sigma2) / res0$sigma2
-0.76488265 0.01975659
So it looks like accounting for year increases the heterogeneity in district.
Acknowledging and moving on from the fact that year is not the best fixed effect... What does it mean if the value is negative/zero?
There is more variance in the random intercepts when you include the fixed effect?
Adding this particular fixed effect does not improve the model?
If I was writing this up, could I say: "Accounting for year does not explain any additional variation in the estimated variance of the random intercepts."
More specifically:
"The heterogeneity accounted for by year is 0% for district level variance, and 1.98% for school level variance."
"The overall heterogeneity accounted for by year is 0%." ???
Thinking about marginal and conditional R^2 for mixed-effects models, there is a separation of fixed and random effects: fixed effects account for X% of the variation, random effects account for Y% of the variation, and Z% is residual variance. But here the fixed effect is somehow folded into sigma^2 (the estimated variance of the random intercepts)? This just confuses me a bit / I think I missed the thought process.
***
-----------------------------------------------------------------------------------------------------------------------------------------
I am not sure how to think about the question of how much variance is explained by a random effect here. One could ask how much of the total heterogeneity
is due to district- and school-level variance, which would be:
res0$sigma2 / sum(res0$sigma2) * 100
Or one could ask how much of the unaccounted for heterogeneity (so heterogeneity not accounted for by 'year') is due to district- and school-level variance, which would be:
res$sigma2 / sum(res$sigma2) * 100
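For readability, the same two quantities can be labeled (the district/school order follows from random = ~ 1 | district/school; the names are only for display):

# share of total heterogeneity at each level (no moderator)
round(setNames(res0$sigma2 / sum(res0$sigma2) * 100, c("district", "school")), 2)

# share of residual heterogeneity at each level (after accounting for 'year')
round(setNames(res$sigma2 / sum(res$sigma2) * 100, c("district", "school")), 2)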
-----------------------------------------------------------------------------------------------------------------------------------------
***
So, to double-check: the heterogeneity accounted for by year is what you showed above:
max(0, (sum(res0$sigma2) - sum(res$sigma2)) / sum(res0$sigma2)) * 100
We also know that:
res0$sigma2 / sum(res0$sigma2) * 100 = 66.52655; 33.47345
max(0, (sum(res0$sigma2) - sum(res$sigma2)) / sum(res0$sigma2)) * 100 = 0
Since these have the same denominator... is it true that:
[max(0, (sum(res0$sigma2) - sum(res$sigma2)) / sum(res0$sigma2)) * 100] + sum(res0$sigma2 / sum(res0$sigma2) * 100) = 100
(I checked this on my data and the answer seems to be no...)
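Written out on the example data, the check would be (note that the shares res0$sigma2 / sum(res0$sigma2) * 100 always sum to 100 by construction, so the two terms can only add to 100 when the overall value is exactly 0):

r2_overall <- max(0, (sum(res0$sigma2) - sum(res$sigma2)) / sum(res0$sigma2)) * 100
shares0 <- res0$sigma2 / sum(res0$sigma2) * 100  # these always sum to 100
r2_overall + sum(shares0)                        # equals 100 + r2_overall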
***
-----------------------------------------------------------------------------------------------------------------------------------------
Or one could ask how much of the total variance (so heterogeneity + sampling variance) is due to district- and school-level variance, which would be:
k <- res$k
wi <- 1/dat$vi
s2 <- (k-1) * sum(wi) / (sum(wi)^2 - sum(wi^2))
res0$sigma2 / (sum(res0$sigma2) + s2) * 100
(here I am using the Higgins & Thompson, 2002, definition of a 'typical' sampling variance).
Or one could ask how much of the unaccounted for variance (so unaccounted for heterogeneity [heterogeneity not accounted for by 'year'] + sampling variance)
is due to district- and school-level variance, which would be:
res$sigma2 / (sum(res$sigma2) + s2) * 100
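A small variant one could also compute here (my addition, not from the quoted reply): include the 'typical' sampling variance as its own share, so that the district, school, and sampling-error pieces sum to 100%:

# share of total variance due to district, school, and sampling error (sums to 100)
round(c(district = res0$sigma2[1],
        school   = res0$sigma2[2],
        sampling = s2) / (sum(res0$sigma2) + s2) * 100, 2)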
-----------------------------------------------------------------------------------------------------------------------------------------
***
Can one ask here how much of the total variance (estimated variance in true effects + sampling variance) is accounted for by year? This is really what I want.
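(A rough, hypothetical sketch of one way this could be framed, loosely following the marginal/conditional R^2 idea of Nakagawa & Schielzeth (2013), reusing the 'typical' sampling variance s2 from above in place of a residual variance; treating fitted(res) as the fixed-effects-only predictions and using s2 this way are assumptions on my part, not an established metafor recipe:)

# variance of the fixed-effect ('year') predictions
var_fixed <- var(fitted(res))

# remaining district- and school-level heterogeneity
var_random <- sum(res$sigma2)

# marginal R^2: share of total variance attributable to 'year'
100 * var_fixed / (var_fixed + var_random + s2)

# conditional R^2: share attributable to 'year' plus the random effects
100 * (var_fixed + var_random) / (var_fixed + var_random + s2)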
***
-----------------------------------------------------------------------------------------------------------------------------------------
> On March 13, 2019 at 6:00 PM "Viechtbauer, Wolfgang (SP)" <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
>
> Dear Theresa,
>
> You forgot some parts in your code, so to make this example reproducible, let's use this model:
>
> dat <- dat.konstantopoulos2011
> res <- rma.mv(yi, vi, mods = ~ factor(year), random = ~ 1 | district/school, data=dat)
>
> And I would consider schools nested within districts here, but this isn't pertinent to the issue.
>
> One way of estimating an R^2-type measure for the fixed effects is to fit the reduced model without the predictor and then compute the proportional reduction in the variance components:
>
> res0 <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)
> pmax(0, (res0$sigma2 - res$sigma2) / res0$sigma2) * 100
> max(0, (sum(res0$sigma2) - sum(res$sigma2)) / sum(res0$sigma2)) * 100
>
> One can compute this per component or overall. There is no guarantee that the value is >= 0, so I use pmax()/max() to set negative values to 0. One could debate whether it even makes sense here to estimate how much of the school-level heterogeneity is accounted for by 'year' (since this is a district-level predictor), but that's another issue.
>
> I am not sure how to think about the question of how much variance is explained by a random effect here. One could ask how much of the total heterogeneity is due to district- and school-level variance, which would be:
>
> res0$sigma2 / sum(res0$sigma2) * 100
>
> Or one could ask how much of the unaccounted for heterogeneity (so heterogeneity not accounted for by 'year') is due to district- and school-level variance, which would be:
>
> res$sigma2 / sum(res$sigma2) * 100
>
> Or one could ask how much of the total variance (so heterogeneity + sampling variance) is due to district- and school-level variance, which would be:
>
> k <- res$k
> wi <- 1/dat$vi
> s2 <- (k-1) * sum(wi) / (sum(wi)^2 - sum(wi^2))
> res0$sigma2 / (sum(res0$sigma2) + s2) * 100
>
> (here I am using the Higgins & Thompson, 2002, definition of a 'typical' sampling variance).
>
> Or one could ask how much of the unaccounted for variance (so unaccounted for heterogeneity + sampling variance) is due to district- and school-level variance, which would be:
>
> res$sigma2 / (sum(res$sigma2) + s2) * 100
>
> These are at least some possibilities.
>
> Best,
> Wolfgang
>
> -----Original Message-----
> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Theresa Stratmann
> Sent: Wednesday, 13 March, 2019 17:29
> To: r-sig-meta-analysis using r-project.org
> Subject: [R-meta] variance explained by fixed & random effects
>
> Dear Meta-Analysis Community,
>
> I am using the metafor package to help me summarize some data on habitat selection. I have a question whose answer I have not found yet on the useful metafor website, and was hoping that someone could help me (or point me to an online resource I missed).
>
> As part of the description of my meta-analysis I would like to explain how much of the variation in habitat selection is due to my fixed effects (season) versus my random effects (individual & year).
>
> For a normal mixed-effects model I know how to compute the marginal and conditional R^2 (Nakagawa & Schielzeth 2013), but I am wondering how to do this correctly for a meta-analysis, or maybe, better said, talk about variance explained by the fixed and random effects.
>
> First I looked into I^2 and the post on this for random-effects models. But then I read more from Borenstein et al. (https://www.meta-analysis-workshops.com/download/common-mistakes1.pdf) and was not quite sure anymore if this is what I want (I^2 = proportion of observed variance that reflects variance in the true effects and not the sampling error ... so this seems more of a summary for the entire model (?), which also needs to be treated with care and presented properly).
>
> Then I found this question on this mailing list: https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-September/000232.html ...but cannot really get this code to produce anything sensible, because I guess I am not clear on the formula for res0 vs. res1.
>
> So I will give a different example, which closely approximates what I am trying to do (a model with fixed & random effects):
>
> dat <- dat.konstantopoulos2011
> res <- rma.mv(yi, vi, mods = ~ factor(year), list( ~ 1 | district , ~ 1|school) data=dat)
> res
>
> How would you calculate / correctly describe variance explained by the fixed effects (year)? By the random effects (district, school)? I am not tied to a particular statistic, just want to make sure that whatever I do is an honest representation of my results and gives readers (& myself) the information they need to interpret the results.
>
> I am new to meta-analyses and this is just a small part of my project, so I have been able to do some reading, but not an in-depth dive. Therefore I would appreciate any help you can offer.
>
> Many thanks,
>
> Theresa Stratmann
>
> B.S. Ecology, The University of Georgia
> M.S. Wildlife & Fisheries Biology, Clemson University
>
> PhD Student
> Goethe University
> and
> Senckenberg Biodiversity and Climate Research Center