[R-sig-ME] dyads nested: confused on interpretation

Sat Jul 15 18:50:56 CEST 2017

Hi Dexter - is there a good reason why you are not using a Poisson, quasi-Poisson or negative binomial regression model? This would be a much more elegant solution to your count-data analysis (regardless of anything else). If you Google 'why not log-transform count data' you'll find plenty of evidence to that effect.
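
For illustration only, here is a minimal sketch of what a count model might look like with lme4::glmer. The data are simulated, every name and number in it (sim, n_sites, the effect sizes) is made up, and the random-effects structure (a random intercept per site) is a simplifying assumption rather than a recommendation for the real data.

```r
# Hypothetical sketch: Poisson mixed model on simulated paired counts.
library(lme4)

set.seed(1)
n_sites <- 40
sim <- data.frame(
  Site = factor(rep(seq_len(n_sites), each = 2)),          # two observations per site
  fb   = factor(rep(c("Front", "Back"), times = n_sites),
                levels = c("Front", "Back")),              # Front as reference level
  City = factor(rep(c("A", "B"), each = n_sites))          # 20 sites in each city
)

# Simulated truth (assumed): back yards have higher expected richness,
# plus a site-level random intercept.
site_eff <- rnorm(n_sites, sd = 0.3)
eta <- log(12) + 0.3 * (sim$fb == "Back") + site_eff[as.integer(sim$Site)]
sim$richness <- rpois(nrow(sim), lambda = exp(eta))

# Counts are modelled directly on the log link; no log(x + 1) transform needed.
fit_pois <- glmer(richness ~ fb * City + (1 | Site),
                  data = sim, family = poisson)

# Exponentiated fixed effects are rate ratios relative to the reference levels.
exp(fixef(fit_pois))
```

If the counts turn out to be overdispersed, lme4::glmer.nb() accepts the same formula and fits a negative binomial version.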

Best

Tom.

-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Dexter Locke
Sent: 15 July 2017 14:34
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] dyads nested: confused on interpretation

Greetings mixed modelers,

I'm fitting a three-level mixed model with lme4::lmer and struggling with the interpretation.

The dependent variable is species richness. There are zeros, and it's not normally distributed, so I've added one and then logged it. Observations are collected as pairs at sites, and are therefore not independent. As Wickham (2014) notes - referencing Bolker - there is an equivalence between a t-test and a mixed model in this type of case. Sites are also uniquely nested within one of two cities, hence the third level. My syntax is:

AAA <- lmer(log(richness + 1) ~ fb * City + (1 | Site / City), data = wy_GardenC, REML = FALSE)

"fb" indicates the location of the observation within the site: either front (Front) or back (Back).

Using sjPlot::sjt.lmer the p-values are calculated and formatted neatly in a table (I do understand the controversies and assumptions around treating t-stats as Wald Z-stats).

The estimated intercept is 2.77 (or 15.96 once back-transformed), the fb variable becomes "fbBack", its beta is 0.42 (or 1.52 once back-transformed). The City and fb*City interaction terms are not significant.

Can I conclude that back yards are on average ~10% (1.52 / 15.96 = 0.095) more species-rich? My confusion is that I'd think R takes b as in Back first as the base case and makes f as in Front the contrast. Plotting the data suggests that back yards in Los Angeles (one of the two cities) are indeed higher:
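
The back-transformation above can be checked with a few lines of base R. This is only a sketch of the arithmetic, assuming that Front is the reference level (as the coefficient name "fbBack" suggests) and that, because of the fb*City interaction, the fb coefficient applies to the reference city only. The variable names are made up for illustration. Under those assumptions a coefficient on the log scale back-transforms to a multiplicative ratio on richness + 1, not to an amount to be divided by the back-transformed intercept.

```r
# Reported estimates on the log(richness + 1) scale
b0 <- 2.77   # intercept (reference level of fb, reference city)
b1 <- 0.42   # fbBack coefficient

exp(b0)      # about 15.96: fitted richness + 1 at the reference level
exp(b1)      # about 1.52: ratio of (richness + 1), contrast vs. reference

# Fitted richness itself at each level (undo the "+ 1")
ref_richness   <- exp(b0) - 1
other_richness <- exp(b0 + b1) - 1

# Ratio of fitted richness between the two levels
other_richness / ref_richness
```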

http://dexterlocke.com/wp-content/uploads/2017/07/unnamed-1.png

I'm not interested in whether all backs (on average) are greater than all fronts (on average). I'm interested in whether, at each site, the back is generally greater than the front. Am I specifying a model that corresponds to this question? Is the front being taken as the reference level, and back as the contrast? Given the factors, what is being contrasted with what base case?
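
As a small base-R sketch of how the base case is chosen: by default factor() orders levels alphabetically and R's treatment contrasts take the first level as the reference, and relevel() changes that. The vectors fb and fb2 below are toy examples, not the real data.

```r
# Toy factor with the two yard positions
fb <- factor(c("Front", "Back", "Front", "Back"))
levels(fb)                       # "Back" "Front": Back would be the reference

# Make Front the reference level instead
fb2 <- relevel(fb, ref = "Front")
levels(fb2)                      # "Front" "Back"

# The non-reference level is the one that shows up in the coefficient name
colnames(model.matrix(~ fb2))    # "(Intercept)" "fb2Back"
```

So a coefficient labelled "fbBack" would be consistent with Front being the reference level in the fitted data; levels(wy_GardenC$fb) would show the actual ordering.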

Thank you for your consideration,
Dexter

Wickham, H. (2014). Tidy Data. Journal of Statistical Software, 59(10). Retrieved from https://www.jstatsoft.org/article/view/v059i10/v59i10.pdf


_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models