[R-meta] Post-hoc weighted analysis based on number of observations

Thu Jan 25 11:08:07 CET 2018

I will need to mull over this for a bit, but I think this falls under 'spatial uncertainty' (a term worth googling in the meantime).

Best,
Wolfgang

>-----Original Message-----
>From: Cesar Terrer Moreno [mailto:cesar.terrer at me.com]
>Sent: Thursday, 25 January, 2018 7:57
>To: Viechtbauer Wolfgang (SP)
>Cc: r-sig-meta-analysis at r-project.org
>Subject: Re: [R-meta] Post-hoc weighted analysis based on number of
>observations
>
>Dear Wolfgang,
>
>Thanks so much for your reply. You have captured the essence of the
>question perfectly.
>
>I have successfully scaled the meta-analysis-derived SE, so I have
>basically produced a global map of the SE of the effect:
>
>SE <- predict(meta,
>                    newmods = cbind(s.df$precipitation, s.df$temperature,
>CO2inc, s.df$temperature*CO2inc))$se
>
>However, as you said, some locations, in this case ecosystems (e.g.
>tropical forests) are poorly represented in the dataset. Therefore, a
>proper assessment of the uncertainties of the approach should account for
>the uncertainty associated with the sampling effort (or lack thereof) in
>some regions. Reviewers will check this for sure.
>
>It turns out that ecosystem type, per se, is not a good predictor, thus
>including it in the meta-regression probably does not make much sense (or
>maybe it does). I was thus thinking more of a post-hoc solution, not
>necessarily in a meta-analytic context, so maybe this distribution list
>is not the right place to ask this question. The idea is to increase SE
>in pixels dominated by ecosystems that are poorly sampled. The final
>quantification of uncertainties would thus be an aggregation of the SEs
>and some sort of multiplier that adds uncertainty in a particular pixel
>as a function of the representativeness of the type of ecosystem in that
>pixel.
>
>For example:
>
>group_by(ecosystem_type) %>% summarise(n = n()) %>% mutate (weight =
>n/sum(n))
>
>SEw = max(SE, na.rm=TRUE) - max(SE, na.rm=TRUE)*weight
>
>SEsum = SE + SEw
>
>SEsum would thus be the sum of SE and another level of error driven by
>the sample size of the type of ecosystem, and constrained to fall within
>the range of observed SE from the dataset.
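
As a rough illustration of the scheme quoted above, here is a minimal base-R sketch with invented numbers. The ecosystem counts, the three example pixels, and the names `pix`, `w` and `SEmax` are assumptions made up for the example; they are not taken from the dataset in this thread.

```r
# Hypothetical number of observations per ecosystem type (invented values)
n <- c(temperate_forest = 60, grassland = 30, tropical_forest = 10)
weight <- n / sum(n)                     # shares: 0.6, 0.3, 0.1

# Hypothetical pixels: model-based SE and the dominant ecosystem type
pix <- data.frame(
  ecosystem_type = c("temperate_forest", "grassland", "tropical_forest"),
  SE = c(0.05, 0.08, 0.10),
  stringsAsFactors = FALSE
)

# Look up the weight of each pixel's ecosystem type by name
w <- unname(weight[pix$ecosystem_type])

# Extra error term: largest for poorly sampled types, zero if weight were 1
SEmax <- max(pix$SE, na.rm = TRUE)
pix$SEw <- SEmax - SEmax * w             # equivalent to SEmax * (1 - w)

# Combined uncertainty as proposed in the quoted message
pix$SEsum <- pix$SE + pix$SEw
print(pix)
```

With these made-up inputs, the added term `SEw` stays between 0 and `max(SE)`, but the sum `SEsum` does not: the tropical forest pixel ends up above the largest model-based SE. So it is the added term, rather than the total, that is bounded by the observed SEs.
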
>
>But I think this approach is not very elegant. Any other ideas?
>Thanks
>César
>
>On 24 Jan 2018, at 23:56, Viechtbauer Wolfgang (SP)
><wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>
>Dear Cesar,
>
>Let me try to understand the essence of your question/issue and abstract
>it a bit from the specifics of your data. So, if I understand things
>correctly, you have data from various places on Earth. Let's pretend
>those places are on a 2d surface, so something like this (where *
>indicates a place where you have data):
>
>+------------------------+
>|     *                  |
>|  *                     |
>|     *                  |
>|                     *  |
>|                 *  *   |
>|                        |
>+------------------------+
>
>You have fitted a model that relates an outcome to some predictor
>variables based on the data for these places. Now you actually have the
>values of the predictor variables for *all* places on that surface and
>you have computed the corresponding predicted values. But there are
>locations for which there were no data to begin with (e.g., upper right
>and lower left) and hence you want the SEs of the predicted values to
>reflect this lack of information in those areas and you are wondering how
>to do that. Does that capture the essence of your question?
>
>Best,
>Wolfgang
>
>
>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-
>project.org] On Behalf Of Cesar Terrer Moreno
>Sent: Monday, 22 January, 2018 18:52
>To: r-sig-meta-analysis at r-project.org
>Subject: [R-meta] Post-hoc weighted analysis based on number of
>observations
>
>I have a gridded dataset representing the standard error (SE) of an
>effect. This SE was calculated through a meta-analysis and subsequent
>predictive model applied on a grid:
>
>ECMmeta <- rma(es, var, data=ecm.df ,control=list(stepadj=.5), mods= ~ 1
>+ MAP + MAT*CO2dif, knha=TRUE)
>options(na.action = "na.pass")
>ECMpred <- predict(ECMmeta,
>                  newmods = cbind(s.df$precipitation, s.df$temperature,
>CO2inc, s.df$temperature*CO2inc))
>ECMrelSE <- rasterFromXYZ(ECMpred[,c("x", "y", "se")],crs="+proj=longlat
>+datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
>
>I would like to add a further level of uncertainty to SE based on the
>number of measurements (observations) per type of ecosystem in the
>dataset. The idea is that ecosystems that are poorly represented by
>experiments in the dataset should have a higher SE than ecosystems with
>plenty of measurements in the dataset.
>
>I thought I could, for example, calculate an ecosystem-based weight as:
>
>weight = n/sum(n)
>
>That is, number of observations in a particular ecosystem divided by the
>total of observations.
>
>The next step would be to apply a weighting approach to each pixel. The
>first approach I've come up with is to simply multiply SE by the inverse
>of the weight:
>
>SEw=SE*(1/weight)
>
>But the values are extremely high.
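
A small sketch with invented counts shows why the inverse-weight multiplier gets so large. The ecosystem names, the counts, and the single SE value are hypothetical and only serve to illustrate the arithmetic.

```r
# Hypothetical number of observations per ecosystem type (invented values)
n <- c(temperate_forest = 60, grassland = 30, tropical_forest = 10)
weight <- n / sum(n)          # shares that sum to 1

# Multiplier implied by SEw = SE * (1 / weight)
inflation <- 1 / weight       # equals sum(n) / n

# Apply it to one hypothetical model-based SE
SE <- 0.05
SEw <- SE * inflation
print(round(cbind(weight, inflation, SEw), 3))
```

Because the weights are shares that sum to 1, every multiplier is at least 1, and a perfectly balanced dataset with k ecosystem types would still multiply every SE by k. The size of the inflation therefore depends on how many categories there are, not only on how poorly a given ecosystem is sampled.
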
>
>An approach like this would be more like a post-hoc patch. I am sure
>something like this can be done within the meta-analysis at the
>beginning. Alternatively, a better post-hoc approach or ideas to
>investigate further would be welcome. Any recommendation or basic
>approach commonly used to add further uncertainty to areas with low
>representativeness?
>
>Thanks