[R-sig-ME] contrasts vs. directly modelling differences: big diff?

Wed Sep 29 14:25:54 CEST 2021

I will not speak to the mixed-effect case but here are some thoughts.

First, Andrew Vickers published a simulation study (BMC Medical Research Methodology 2001, 1:6) demonstrating that analyzing percent change from baseline is less efficient than an ANCOVA on the raw data.
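
To make the comparison concrete, here is a minimal sketch in R of the two analyses on one simulated two-arm data set. Everything in it (sample size, effect size, variable names such as baseline, followup and trt) is an assumption made up for illustration, not taken from the paper or from this thread. A single data set cannot show a difference in efficiency; for that, one would repeat the simulation many times and compare how often each analysis detects the treatment effect.

```r
# Hypothetical two-arm trial: all numbers below are illustrative assumptions.
set.seed(1)
n <- 100
trt <- rep(c(0, 1), each = n / 2)               # 0 = control, 1 = treated
baseline <- rnorm(n, mean = 50, sd = 10)        # pre-treatment measurement
followup <- 10 + 0.5 * baseline + 5 * trt + rnorm(n, sd = 10)
dat <- data.frame(trt, baseline, followup)

# Percent change from baseline, computed per subject
dat$pct_change <- (dat$followup - dat$baseline) / dat$baseline * 100

# Analysis 1: ANCOVA on the raw follow-up value, adjusting for baseline
fit_ancova <- lm(followup ~ baseline + trt, data = dat)

# Analysis 2: percent change as the outcome
fit_pct <- lm(pct_change ~ trt, data = dat)

# Treatment-effect rows (estimate, SE, t, p) from each analysis
summary(fit_ancova)$coefficients["trt", ]
summary(fit_pct)$coefficients["trt", ]
```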

Second, as I recall, when you analyze on the log scale and exponentiate to obtain percent change, the resulting percent change estimate is symmetric, whereas the usual percent-difference calculation is not.
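
A small arithmetic sketch of what symmetry means here, using made-up values (50 and 100, not from the thread): the usual percent change depends on which value is taken as the baseline, while on the log scale swapping the direction only flips the sign of the difference, so the back-transformed ratios are reciprocals of each other.

```r
# Illustrative values only (not from the thread)
a <- 50
b <- 100

# Usual percent change: the answer depends on the baseline
pct_up   <- (b - a) / a * 100   # going from a to b
pct_down <- (a - b) / b * 100   # going from b to a

# Differences on the log scale: equal in size, opposite in sign
log_up   <- log(b) - log(a)
log_down <- log(a) - log(b)

# Back-transformed ratios: reciprocals of each other
ratio_up   <- exp(log_up)
ratio_down <- exp(log_down)
```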

I see no obvious reason why these ideas would not carry over to mixed-effect models.

> On Sep 29, 2021, at 8:04 AM, Guillaume Adeux <guillaumesimon.a2 using gmail.com> wrote:
> 
> Good afternoon everyone,
> 
> I have this recurring question of whether it is best to directly model the
> response as % differences (e.g. Yield_loss=(Yield_without_weeds -
> Yield_with_weeds)/(Yield_without_weeds)) or whether it is best to directly
> model the response (e.g. Yield) and compute yield loss through post hoc
> contrasts on the log scale.
> 
> I hope the following example illustrates this better:
> Let's take two different weed communities: WC1 and WC2.
> Each community is present on 6 fields, with multiple samples per field.
> Next to each weedy sample of WC1 and WC2 within each of the 6 fields, there
> is a hand-weeded control, inducing a hierarchical structure (paired data
> within each field for each of the two weed communities).
> The objective is to compute yield loss induced by the two communities and
> to compare them.
> One option could be to directly compute yield loss (e.g.
> Yield_loss=(Yield_without_weeds - Yield_with_weeds)/(Yield_without_weeds))
> for each weedy/weeded pair within each field and model
> mod0=glmer(Yield_loss~WC+(1|field)+(1|field:WC),family="binomial",data=yl)
> (I suppose beta or beta-binomial would also be a reasonable choice, but
> that is not today's question). Comparisons could then be made with
> cld(emmeans(mod0,~WC))
> 
> Another option could be to directly model the response (e.g. Yield),
> introduce a "Handweeding" (yes/no) variable and compute Yield loss through
> the following code:
> mod1=lmer(log(Yield)~Handweeding*WC+(1|field)+(1|field:Handweeding)+(1|field:WC)+(1|field:Handweeding:WC),data=yl)
> 
> x=pairs(emmeans(mod1,~Handweeding|WC),reverse=TRUE)
> # differences on the log scale are exponentiated
> y=regrid(x,transform="response")
> # is yield loss significantly different from 0 for each of the 2 communities?
> summary(y,infer=c(TRUE,TRUE),null=1)
> # is yield loss induced by WC1 different from yield loss induced by WC2?
> cld(emmeans(y,~WC),adjust="mvt")
> 
> Are both these "procedures" correct? Which is preferable? Why?
> 
> Do not hesitate to request further information if I wasn't clear enough.
> Thanks a lot.
> Guillaume ADEUX

-- 
Kevin E. Thorpe
Head of Biostatistics,  Applied Health Research Centre (AHRC)
Li Ka Shing Knowledge Institute of St. Michael's
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.thorpe using utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016