[R-sig-ME] contrasts vs. directly modelling differences: big diff?

Wed Sep 29 14:27:40 CEST 2021

Dear Guillaume,

I prefer to model the yield as that is the response of your experiment. It
is easier to find a suitable distribution for the yield.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

<https://www.inbo.be>

Op wo 29 sep. 2021 om 14:04 schreef Guillaume Adeux <
guillaumesimon.a2 using gmail.com>:

> Good afternoon everyone,
>
> I have this recurring question of whether it is best to directly model the
> response as % differences (e.g. Yield_loss=(Yield_without_weeds -
> Yield_with_weeds)/(Yield_without_weeds)) or whether it is best to directly
> model the response (e.g. Yield) and compute yield loss through post hoc
> contrasts on the log scale.
>
> I hope this following example can illustrate better:
> Let's take two different weed communities: WC1 and WC2.
> Each community is present on 6 fields, with multiple samples per field.
> Next to each weedy sample of WC1 and WC2 within each of the 6 fields, there
> is a hand weed control, inducing a hierarchical structure (paired data
> within each field for each of the two weed communities).
> The objective is to compute yield loss induced by the two communities and
> to compare them.
> One option could be to directly compute yield loss (e.g.
> Yield_loss=(Yield_without_weeds - Yield_with_weeds)/(Yield_without_weeds))
> for each weedy/weeded couple within each field and model
> *
> mod0=glmer(Yield_loss~WC+(1|field)+(1|field:WC),family="binomial"),data=yl)*
> (I
> suppose beta or beta_binomial would also be a reasonable choice but it's
> not the matter of today). Comparisons could then be made with
> *cld(emmeans(mod0,~WC))*
>
> Another option could be to directly model the response (e.g. Yield),
> introduce a "Handweeding" (yes/no) variable and compute Yield loss through
> the following code:
>
> *mod1=lmer(log(Yield)~Handweeding*WC+(1|field)+(1|field:Handweeding)+(1|field:WC)+(1|field:Handweeding:WC),data=yl)*
>
> *x=pairs(emmeans(mod1,~Handweeding|WC),reverse=TRUE)
> y=regrid(x,transform="response")
> *# differences on the log scale are exponentiated
> *summary(y,infer=c(TRUE,TRUE),null=1) *# is yield loss significantly
> different from 0 for each of the 2 community?
> *cld(emmeans(y,~WC),adjust="mvt") *# is yield loss induced by WC1 different
> from yield loss induced by WC2?
>
> Are both these "procedures" correct? Which is preferable? Why?
>
> Do not hesitate to request further information if I wasn't clear enough.
> Thanks a lot.
> Guillaume ADEUX
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]