[R-sig-ME] lmer nesting

Fri Mar 11 15:13:05 CET 2022

Dear list members,

I have a conceptual question about mixed modeling for the following data.

Soybean grain samples were collected across North and South America
during the 2012, 2013 and 2014 cropping seasons and analysed for
several biochemical grain quality traits. The sampling structure is
hierarchical. The American continent* was divided into three regions:
US, Brazil (including Paraguay) and The Pampas (including Argentina
and Uruguay). Each region was subdivided into sub_regions (north,
south, central), and each subregion comprises all the states or
provinces within it.

The main objective is to analyze variation and variance for regions,
subregions and years, in particular: 1) to assess differences among
regions in each quality variable; 2) to get an idea of the stability
of this pattern by comparing the proportion of the variation explained
by region with the proportion explained by year; 3) to compare
variability within a region (among subregions) against variability
among regions.

Our first attempt was to fit a mixed model in which region, subregion
and state are fixed effects and year is a random effect: "mod".

I basically wonder whether this model syntax is adequate to address
our objectives (in this case for protein), or whether I should modify
the nesting factors.

So far I have generated two nested variables: subregion = sub_region
nested in region, and state = state_proc nested in subregion.
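
As an illustration of what "nested" means here, the sketch below shows
one common way to build such labels in base R. It is not necessarily
how the variables in df were created. The column names sub_region and
state_proc are the ones mentioned above; the data frame df_toy and all
of its values are made up for the example.

```r
# Hypothetical toy rows; only the column names come from the post
df_toy <- data.frame(
  region     = c("US", "US", "Brazil", "Brazil"),
  sub_region = c("north", "south", "north", "north"),
  state_proc = c("state1", "state2", "state3", "state4")
)

# interaction() combines the parent label with the child label, so
# "north" in US and "north" in Brazil become two distinct levels
df_toy$subregion <- interaction(df_toy$region, df_toy$sub_region, drop = TRUE)
df_toy$state     <- interaction(df_toy$subregion, df_toy$state_proc, drop = TRUE)

levels(df_toy$subregion)
```

With drop = TRUE, combinations that never occur in the data (here
Brazil with south) are not kept as empty levels.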

library(lme4)

mod <- lmer(prot_db ~ region + subregion + state + (1 | year) +
            (1 | region:year) + (1 | subregion:year), data = df)

# VCA: extract variance components for the fixed and random terms
library(VCA)
anovaVCA(form = prot_db ~ region + (year) + region:(year) + subregion +
         subregion:(year) + state, Data = df)
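
For objective 2, a minimal sketch of how variance proportions can be
read off an lmer fit is given below. It assumes the lme4 package and
uses simulated data (the object toy and all effect sizes are made up),
with a simpler model than "mod". Note that the proportions cover only
the random terms and the residual; region is a fixed effect in this
formulation, so it has no variance component in the table.

```r
library(lme4)

set.seed(1)
# Hypothetical balanced toy data: 3 regions x 3 years x 30 samples per cell
toy <- expand.grid(region = c("US", "Brazil", "Pampas"),
                   year   = factor(2012:2014),
                   rep    = 1:30)

# Simulated effects with made-up magnitudes
region_shift <- c(0, 1, 2)[as.integer(toy$region)]
year_eff     <- rnorm(3, sd = 1)[as.integer(toy$year)]
ry           <- interaction(toy$region, toy$year, drop = TRUE)
ry_eff       <- rnorm(nlevels(ry), sd = 0.5)[as.integer(ry)]
toy$prot_db  <- 36 + region_shift + year_eff + ry_eff +
                rnorm(nrow(toy), sd = 1)

fit <- lmer(prot_db ~ region + (1 | year) + (1 | region:year), data = toy)

# One row per variance component: region:year, year, Residual
vc <- as.data.frame(VarCorr(fit))
vc$prop <- vc$vcov / sum(vc$vcov)
vc[, c("grp", "vcov", "prop")]
```

With only three years, the year variance is estimated from three
levels, so lmer may report a singular fit and the year proportion
should be read with caution.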

Sorry if this was too long. I'm fairly new to modeling, so any help is
highly appreciated.

Best,
Anibal Cerrudo

*The American continent produces 90% of the world's soybean and
represents 60% of the protein that the world consumes each year.
Environment and genetics affect composition (quality), and there is a
need to determine spatial patterns. The study is aimed at end users
deciding where to buy soybean for different end uses.