[R-sig-ME] seeking some advice on fixed vs random specification
Peter Claussen
dakotajudo at mac.com
Tue Nov 1 16:56:29 CET 2011
David,
Have you considered that if TIME and REGION are crossed as fixed effects (TIME*REGION), you should also treat them as crossed, not nested, when they are random effects? Thus:
lmer(log(CONT) ~ WB_TYPE + PORT*len + (1|TIME) + (1|REGION/wb_id))
Are TIME and REGION considered to be two independent sources of random variation, which would be implied by this model?
If you want to model variation across time differently for each region, then perhaps (TIME | REGION/wb_id) may be more appropriate.
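As a rough illustration of what (TIME | REGION/wb_id) would ask the model to estimate, here is a minimal base-R sketch. The toy data frame and the variable names Z_cols, q, n_cov_par and n_total are made up for this example; it assumes R's default treatment contrasts and an unstructured covariance matrix for each grouping factor.

```r
# Hypothetical toy data frame; only the TIME factor matters here.
d <- data.frame(TIME = factor(c("A", "B", "C")))

# (TIME | g) puts one random effect on each column of the TIME design matrix:
# an intercept plus two treatment contrasts under R's default coding.
Z_cols <- colnames(model.matrix(~ TIME, data = d))
q <- length(Z_cols)  # random effects per group

# An unstructured q x q covariance matrix has q variances and
# q * (q - 1) / 2 covariances.
n_cov_par <- q * (q + 1) / 2

# REGION/wb_id expands to REGION + REGION:wb_id, so there are two such matrices.
n_total <- 2 * n_cov_par

cat("random effects per group:", q, "\n")
cat("covariance parameters per grouping factor:", n_cov_par, "\n")
cat("covariance parameters in total:", n_total, "\n")
```

With three time blocks the count stays small, but it grows quickly with the number of levels, which may matter for sites that were sampled in only one or two blocks.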
I would interpret (1|TIME/REGION), based on analogy to (1 | BLOCK/PLOT), to mean the REGION identified as "1" in TIME "A" would not be in any way related to REGION "1" in TIME "B"; that is, the region identifier only has meaning within the context of the time identifier.
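The difference in grouping structure can be sketched in base R without fitting anything. The toy design below (3 time blocks by 4 regions) and the variable names are hypothetical; the only fact relied on is that (1|a/b) is shorthand for (1|a) + (1|a:b).

```r
# Hypothetical toy design: 3 time blocks x 4 regions,
# every region observed in every time block.
d <- expand.grid(TIME = c("A", "B", "C"),
                 REGION = as.character(1:4),
                 stringsAsFactors = TRUE)

# Crossed specification, (1|TIME) + (1|REGION): one effect per time block
# and one per region, with each region effect shared across time blocks.
n_time   <- nlevels(d$TIME)
n_region <- nlevels(d$REGION)

# Nested specification, (1|TIME/REGION), is shorthand for
# (1|TIME) + (1|TIME:REGION): the second grouping factor is the
# TIME-by-REGION combination, so region "1" in A and region "1" in B
# are treated as separate, unrelated groups.
time_region   <- interaction(d$TIME, d$REGION, sep = ":", drop = TRUE)
n_time_region <- nlevels(time_region)

cat("crossed groups:", n_time, "time +", n_region, "region\n")
cat("nested groups: ", n_time, "time +", n_time_region, "time:region\n")
```

In the nested version there is no single REGION effect to follow across time blocks, which is why it does not directly answer a question about change within a region.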
Peter Claussen
Gylling Data Management
On Oct 31, 2011, at 9:27 AM, david depew wrote:
> Dear list,
> I am seeking some thoughts/advice on whether my approach to this problem
> (below) makes sense.
>
> We have compiled a rather large dataset (n > 25,000 for most species of
> interest) on the levels of a contaminant in fish covering 40 years and a
> continental scale. We would like to investigate broad temporal changes
> across a large geographic region. Because the data comes from a variety of
> sources, with different resources and mandates for sampling fish, we do not
> consider this dataset to be a "true random sample", but in the absence of
> such, this is the best possible approximation to one.
>
> Sites that are sampled over time are generally not sampled frequently
> enough and with sufficient constraints (sample sizes, sizes of fish) to do
> more focused analysis of temporal trends.
>
> Having spent some time perusing the resources available on mixed models, I
> think this offers the best choice for making some sense of this messy
> dataset. I'm less inclined to try and estimate site specific slopes
> (regressed over year) for sites that have low sampling effort.
>
> Rather, I split the dataset into time periods (A, B and C) of ~15-year
> blocks (note: the levels of this particular contaminant are known to
> change very slowly over time) and assigned each site to an ecoregion
> based on geographic location. Thus, I am aiming to assess (if possible)
> whether levels of contaminant in each ecoregion change over the time blocks
> (A, B and C), where sites are assumed to represent a random selection of
> possible locations within an ecoregion.
>
> The variables of interest are as follows:
> CONT = contaminant concentration
> WB_TYPE = waterbody type (lake, river)
> PORT = portion (fillet, whole fish)
> len = mean-centered length of fish
> REGION = ecoregion (37 unique types)
> TIME = time block (A, B or C)
> wb_id = unique id of site
>
> My initial thought was to specify the model with time and region as fixed
> effects.
>
> lmer(log(CONT) ~ TIME*REGION + WB_TYPE + PORT*len + (1|wb_id))
>
> Comparison of this model with one with only additive time and region terms
> suggests that the interaction improves the model fit and is probably
> important.
>
> I can test TIME and REGION interaction contrasts specifically using the
> multcomp package and the results indeed suggest some regions have
> significant changes between time blocks.
>
> Or,
>
> would it make more sense to specify the time and region effects as part of
> the random terms with site nested within region, nested within time period?
>
> lmer(log(CONT) ~ WB_TYPE + PORT*len + (1|TIME/REGION/wb_id))
>
> I'm assuming (perhaps wrongly) that the conditional means and 95% CI could
> be extracted and compared to assess changes within a region?
>
> I'm aware that there are arguments that can be made to treat TIME and
> REGION as either fixed or random, depending on the objective of the
> analysis. I'm mainly seeking some clarification if a) my interpretation of
> the specified model is correct, and b) if this makes sense with respect to
> the initial problem.
>
> Any thoughts or advice would be much appreciated.
>
> thanks
>
>
>
> --
> David Depew
> Postdoctoral Fellow
> School of Environmental Studies
> Queen's University
> Kingston, Ontario
> K7L 3N6
>
> david.depew at queensu.ca