[R-meta] Post-hoc weighted analysis based on number of observations

Thu Jan 25 13:32:30 CET 2018

Thanks Wolfgang. I'm reading some material about spatial uncertainty and it's indeed interesting, though complex, so I am a bit lost at the moment. Please take your time to think about it. I can send you my code and data so far.

Cheers,
Cesar

> On 25 Jan 2018, at 11:08, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> 
> I will need to mull over this for a bit, but I think this falls under 'spatial uncertainty' (a term worth googling in the meantime).
> 
> Best,
> Wolfgang
> 
>> -----Original Message-----
>> From: Cesar Terrer Moreno [mailto:cesar.terrer at me.com]
>> Sent: Thursday, 25 January, 2018 7:57
>> To: Viechtbauer Wolfgang (SP)
>> Cc: r-sig-meta-analysis at r-project.org
>> Subject: Re: [R-meta] Post-hoc weighted analysis based on number of
>> observations
>> 
>> Dear Wolfgang,
>> 
>> Thanks so much for your reply. You have captured the essence of the
>> question perfectly.
>> 
>> I have successfully scaled the meta-analysis-derived SE, so I have
>> basically produced a global map of the SE of the effect:
>> 
>> SE <- predict(meta,
>>                     newmods = cbind(s.df$precipitation, s.df$temperature,
>> CO2inc, s.df$temperature*CO2inc))$se
>> 
>> However, as you said, some locations, in this case ecosystems (e.g.
>> tropical forests) are poorly represented in the dataset. Therefore, a
>> proper assessment of the uncertainties of the approach should account for
>> the uncertainty associated with the sampling effort (or the lack thereof) in
>> some regions. Reviewers will check this for sure.
>> 
>> It turns out that ecosystem type, per se, is not a good predictor, so
>> including it in the meta-regression probably does not make much sense (or
>> maybe it does). I was thus thinking more of a post-hoc solution, not
>> necessarily in a meta-analytic context, so maybe this mailing list
>> is not the right place to ask this question. The idea is to increase SE
>> in pixels dominated by ecosystems that are poorly sampled. The final
>> quantification of uncertainties would thus be an aggregation of the SEs
>> and some sort of multiplier that adds uncertainty in a particular pixel
>> as a function of the representativeness of the type of ecosystem in that
>> pixel.
>> 
>> For example:
>> 
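>> # assuming ecm.df (the data used for the model) has an ecosystem_type column
>> ecm.df %>% group_by(ecosystem_type) %>% summarise(n = n()) %>%
>>   mutate(weight = n/sum(n))
>> 
>> SEw <- max(SE, na.rm=TRUE) - max(SE, na.rm=TRUE)*weight
>> 
>> SEsum <- SE + SEw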
>> 
>> SEsum would thus be the sum of SE and another level of error driven by
>> the sample size of the type of ecosystem, and constrained to fall within
>> the range of observed SE from the dataset.
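The calculation described above can be sketched in base R as follows. This is only an illustration: the ecosystem names, counts, and SE values are hypothetical, and the `counts` and `pixels` data frames are assumptions rather than objects from this thread.

```r
# Hypothetical number of observations per ecosystem type (illustrative only)
counts <- data.frame(
  ecosystem_type = c("temperate_forest", "grassland", "tropical_forest"),
  n = c(60, 35, 5)
)
counts$weight <- counts$n / sum(counts$n)

# Hypothetical pixels, each with a model-based SE and a dominant ecosystem type
pixels <- data.frame(
  ecosystem_type = c("temperate_forest", "grassland",
                     "tropical_forest", "tropical_forest"),
  SE = c(0.10, 0.15, 0.12, 0.20)
)

# Look up the ecosystem weight for every pixel
pixels$weight <- counts$weight[match(pixels$ecosystem_type, counts$ecosystem_type)]

# Extra error term: largest for poorly sampled ecosystems
maxSE <- max(pixels$SE, na.rm = TRUE)
pixels$SEw <- maxSE - maxSE * pixels$weight

# Combined uncertainty per pixel
pixels$SEsum <- pixels$SE + pixels$SEw
pixels
```

With this formula the added term SEw always stays below max(SE), while SEsum itself can exceed max(SE) and approach twice that value for a pixel that already has a high SE and a rare ecosystem type.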
>> 
>> But I think this approach is not very elegant. Any other ideas?
>> Thanks
>> César
>> 
>> On 24 Jan 2018, at 23:56, Viechtbauer Wolfgang (SP)
>> <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
>> 
>> Dear Cesar,
>> 
>> Let me try to understand the essence of your question/issue and abstract
>> it a bit from the specifics of your data. So, if I understand things
>> correctly, you have data from various places on Earth. Let's pretend
>> those places are on a 2d surface, so something like this (where *
>> indicates a place where you have data):
>> 
>> +------------------------+
>> |     *                  |
>> |  *                     |
>> |     *                  |
>> |                     *  |
>> |                 *  *   |
>> |                        |
>> +------------------------+
>> 
>> You have fitted a model that relates an outcome to some predictor
>> variables based on the data for these places. Now you actually have the
>> values of the predictor variables for *all* places on that surface and
>> you have computed the corresponding predicted values. But there are
>> locations for which there were no data to begin with (e.g., upper right
>> and lower left) and hence you want the SEs of the predicted values to
>> reflect this lack of information in those areas and you are wondering how
>> to do that. Does that capture the essence of your question?
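One simple way to make "no data nearby" concrete on such a 2d surface is to compute, for every grid cell, the distance to the nearest sampled place. The sketch below is only an illustration in base R and not a method proposed in this thread; the site coordinates and the grid size are hypothetical.

```r
# Hypothetical coordinates of the places with data (illustrative only)
sites <- data.frame(x = c(6, 3, 6, 22, 18, 21),
                    y = c(5, 4, 3, 2, 1, 1))

# All cells of a 24 x 6 surface
grid <- expand.grid(x = 1:24, y = 1:6)

# Euclidean distance from every grid cell to every site
dx <- outer(grid$x, sites$x, "-")
dy <- outer(grid$y, sites$y, "-")
d  <- sqrt(dx^2 + dy^2)

# Distance to the nearest site; large values flag poorly covered areas
grid$nearest <- apply(d, 1, min)

# The grid cell that is farthest from any data
grid[which.max(grid$nearest), ]
```

Cells with a large `nearest` value are those where predictions rest on extrapolation in space, which the SE from the regression model alone does not capture.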
>> 
>> Best,
>> Wolfgang
>> 
>> 
>> -----Original Message-----
>> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-
>> project.org] On Behalf Of Cesar Terrer Moreno
>> Sent: Monday, 22 January, 2018 18:52
>> To: r-sig-meta-analysis at r-project.org
>> Subject: [R-meta] Post-hoc weighted analysis based on number of
>> observations
>> 
>> I have a gridded dataset representing the standard error (SE) of an
>> effect. This SE was calculated through a meta-analysis and subsequent
>> predictive model applied on a grid:
>> 
>> ECMmeta <- rma(es, var, data=ecm.df ,control=list(stepadj=.5), mods= ~ 1
>> + MAP + MAT*CO2dif, knha=TRUE)
>> options(na.action = "na.pass")
>> ECMpred <- predict(ECMmeta,
>>                   newmods = cbind(s.df$precipitation, s.df$temperature,
>> CO2inc, s.df$temperature*CO2inc))
>> ECMrelSE <- rasterFromXYZ(ECMpred[,c("x", "y", "se")],crs="+proj=longlat
>> +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0")
>> 
>> I would like to add a further level of uncertainty to SE based on the
>> number of measurements (observations) per type of ecosystem in the
>> dataset. The idea is that ecosystems that are poorly represented by
>> experiments in the dataset should have a higher SE than ecosystems with
>> plenty of measurements in the dataset.
>> 
>> I thought I could, for example, calculate an ecosystem-based weight as:
>> 
>> weight = n/sum(n)
>> 
>> That is, number of observations in a particular ecosystem divided by the
>> total of observations.
>> 
>> The next step would be to apply a weighting approach to each pixel. The
>> first approach I've come up with is to simply multiply SE by the inverse
>> of the weight:
>> 
>> SEw=SE*(1/weight)
>> 
>> But the values are extremely high.
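A minimal sketch in base R shows why the inverse weight gives such large values. The ecosystem names and counts below are made up for illustration and are not taken from the dataset discussed here.

```r
# Hypothetical number of observations per ecosystem type (illustrative only)
n <- c(temperate_forest = 60, grassland = 30,
       boreal_forest = 8, tropical_forest = 2)

# Proportion of all observations that falls in each ecosystem
weight <- n / sum(n)

# Factor by which SE is multiplied in SEw = SE * (1/weight)
multiplier <- 1 / weight
round(multiplier, 1)
```

Because the weights sum to 1, every multiplier is at least 1, and an ecosystem that contributes 2% of the observations has its SE multiplied by 50.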
>> 
>> An approach like this would be more like a post-hoc patch. I am sure
>> something like this can be done within the meta-analysis at the
>> beginning. Alternatively, a better post-hoc approach or ideas to
>> investigate further would be welcome. Any recommendation or basic
>> approach commonly used to add further uncertainty to areas with low
>> representativeness?
>> 
>> Thanks