[R-sig-ME] Modelling non-negative non-zero continuous data

Thierry Onkelinx thierry.onkelinx at inbo.be
Mon May 2 09:43:39 CEST 2022

Dear Vicki,

When you have only one measurement per nest box, you can't include
"nest box" as a random effect, because it would be confounded with the
residuals.
I recommend adding "year" as a fixed effect factor. I wrote a blog post on
the required number of levels for a random effect:
I presume you did an exploratory data analysis and handled covariates with
strong correlation. Do all covariates have a linear relationship with the
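
The checks mentioned above could be sketched roughly as follows. This is only
an illustration on simulated data, not the original analysis: the response
name TARSUS, the covariates sc.x1 and sc.x2 and all simulated values are
hypothetical, and only SITE_ID, BOX_NUMBER and YEAR are taken from the
question. The model fit is skipped when lme4 is not installed.

```r
# Simulated stand-in data; column names other than SITE_ID, BOX_NUMBER and
# YEAR, and all values, are hypothetical.
set.seed(1)
n <- 457
dat <- data.frame(
  SITE_ID    = factor(rep(1:31, length.out = n)),
  BOX_NUMBER = factor(seq_len(n)),  # one measurement per nest box
  YEAR       = factor(sample(1:6, n, replace = TRUE), levels = 1:6),
  sc.x1      = as.numeric(scale(rnorm(n))),
  sc.x2      = as.numeric(scale(rnorm(n)))
)
site_eff <- rnorm(31, sd = 0.5)  # simulated between-site variation
dat$TARSUS <- 17 + site_eff[as.integer(dat$SITE_ID)] +
  0.3 * dat$sc.x1 + rnorm(n, sd = 1)

# 1. Replication: how many measurements does each nest box have?
#    If the maximum is 1, (1 | BOX_NUMBER) has one level per observation
#    and cannot be separated from the residual variance.
max_obs_per_box <- max(table(dat$BOX_NUMBER))

# 2. Number of levels of each remaining grouping variable.
n_levels <- sapply(dat[c("SITE_ID", "YEAR")], nlevels)

# 3. Exploratory check for strongly correlated covariates.
cov_cor <- cor(dat[c("sc.x1", "sc.x2")])

# 4. Year as a fixed-effect factor, site as a random intercept.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(TARSUS ~ sc.x1 + sc.x2 + YEAR + (1 | SITE_ID),
                    data = dat)
  print(summary(fit))
}
```

Plotting the response against each covariate (for example with plot() or a
smoother) is one simple way to look at whether a linear relationship is
plausible.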

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey


On Fri, 29 Apr 2022 at 20:58, Victoria Pattison-Willits <
victoriaswillits using gmail.com> wrote:

> Dear all,
> I am hoping you can help me. I am trying to model chick tarsi (leg length)
> data. Briefly, I have mean measurements of tarsus length from 457 nests.
> The data were collected across 31 sites (10 nests in each site) over a
> six-year period so I have an *a priori* nested random effects structure:
> (1|SITE_ID/BOX_NUMBER) + (1|YEAR). (Although I have had to remove the
> nested nest_box term due to convergence issues - there is a lot of variance
> between nests within sites.)
> The problem that I am running into is that the data are bounded between
> 12.73 and 20.12 mm. Both the data themselves and the residuals from an
> lmer model are left-skewed because the data are non-negative and non-zero.
> The initial suite of models I have tried follows the pattern below. I am
> running models using both glmmTMB and lme4 (lmer).
> (Also I have run the same models using the same length data for a bunch of
> other response variables with no issues including various breeding outcomes
> and chick measurements). Fixed covariates are scaled and centred: (sc.)
> e.g.
> ```{r}
> data=DF_CHICK_TARSUS, family = gaussian)
> summary(TL_BUILT_FULL_TMB2)
> ```
> I am a little stumped as to what to do - I have run the same model using
> reflected and log (and/or square root) transformed data - which does seem
> to resolve the residual issues. However, I know that this is not the best
> resolution and is rarely done, and transforming data even for the more
> commonly found right-skewed data is increasingly discouraged. However, I am
> not finding (and this may be me not using the correct terms in my search!)
> any other options to overcome the issue of non-negative non-zero data -
> plenty of advice for ecological data that is right-skewed or left-skewed
> and zero-inflated!
> If anyone can help me I would really appreciate it. Thank you all so much
> as always in advance for your time and knowledge sharing. I am gradually
> building up my competence in R and mixed modelling and this forum has been
> really helpful on this steep learning curve! I am hoping I am just missing
> something obvious! Please let me know if you need any other information
> from me. Thank you!
> Very best wishes.
> Vicki Willits
> >
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
