[R-sig-ME] hypothesis testing - full model doesn't converge

Wed Mar 15 09:11:42 CET 2023

Hello everyone,
I'm currently trying to analyse the relationship between crop diversity and
pesticide use at a national scale (4k farms covering 7 climatic regions),
while accounting for crop identity effects (proportion of each crop in the
cropping system), in order to distinguish "true diversity" effects from
"dilution effects" (introducing a meadow in a cropping system necessarily
reduces pesticide use).
The simplest model would be something like:

    mod <- glmmTMB(pesticide_use ~ climatic_region + crop_1 + ... + crop_25 +
                     crop_diversity + (1 | climatic_region:soil) + (1 | year),
                   family = tweedie(link = "log"), data = pest)

This model converges, but it doesn't account for the fact that, for a given
crop, pesticide use may be higher under certain weather conditions (i.e. in
certain climatic regions), which is well known to ecologists and
agronomists.
Hence, a more complete model would be:

    mod_inter <- glmmTMB(pesticide_use ~ climatic_region *
                           (crop_1 + ... + crop_25 + crop_diversity) +
                           (1 | climatic_region:soil) + (1 | year),
                         family = tweedie(link = "log"), data = pest)

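In case the shorthand for the 25 crop-proportion terms is unclear, here is a
minimal sketch of how the two formulas could be built programmatically. It
assumes the proportion columns are literally named crop_1 to crop_25, which
is an illustrative naming and not necessarily that of the real data set.

```r
# Assumed (illustrative) column names for the 25 crop proportions
crop_terms <- paste0("crop_", 1:25)

# Crop proportions plus crop diversity, joined into one additive expression
main_part <- paste(c(crop_terms, "crop_diversity"), collapse = " + ")

# Random-effect part shared by both models
re_part <- "(1 | climatic_region:soil) + (1 | year)"

# Additive model: region + crops + diversity
f_add <- as.formula(
  paste("pesticide_use ~ climatic_region +", main_part, "+", re_part)
)

# "Full" model: region interacting with every crop term and with diversity
f_inter <- as.formula(
  paste("pesticide_use ~ climatic_region * (", main_part, ") +", re_part)
)
```

Either formula object could then be passed as the first argument to
glmmTMB() in place of the hand-written formula.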
However, this "full" model doesn't converge, even when I increase the
maximum number of iterations or switch to a different optimizer.
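To make this concrete, the following is a minimal sketch of what such
attempts typically look like in glmmTMB. The specific values and the choice
of BFGS are illustrative assumptions, not necessarily the exact settings
tried here.

```r
library(glmmTMB)

# Allow more iterations and function evaluations for the default
# nlminb optimizer (1e4 is an arbitrary, illustrative value)
ctrl_iter <- glmmTMBControl(optCtrl = list(iter.max = 1e4, eval.max = 1e4))

# Switch from nlminb to optim() with the BFGS method
ctrl_bfgs <- glmmTMBControl(optimizer = optim,
                            optArgs = list(method = "BFGS"))

# Either object is then passed through the `control` argument, e.g.
# glmmTMB(<formula>, family = tweedie(link = "log"), data = pest,
#         control = ctrl_bfgs)
```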
I was therefore tempted to identify the most complete model that converges
(for example with the R package buildmer), but I understand from my reading
that model selection and hypothesis testing don't go well together (p-values
are biased after model selection). I wasn't able to find out whether these
p-values can be corrected after model selection.
Bert van der Veen suggested that I look into variable selection (glmmLasso,
for example), but it appears less flexible: the Tweedie and gamma families
are not implemented, and I will most likely have to correct for spatial
autocorrelation in my selected model, which is possible in glmmTMB but not
in glmmLasso.
Because of multicollinearity problems involving crop diversity, I was
thinking of identifying the most complete crop model (only crop proportions
and their interactions with climatic region) and comparing it to the same
model with crop diversity and its interaction with climatic region added.
How would you more knowledgeable folks go about tackling this problem? Is
this procedure a little dubious?
Thanks a lot for your feedback.
Have a great day.
GA2
