[R-sig-ME] error distribution or transformation for Acid-neutralizing-capacity (ANC)
ethomas at stanford.edu
Tue Apr 22 19:21:46 CEST 2014
Brooks, try y = log(ANC + 17). Is the distribution of y approximately normal? I would justify the offset of 17 as necessitated by the data (your minimum is -16, so ANC + 17 is always positive), and it shouldn't change your interpretation of the originally negative values (now y < log(17), about 2.83).
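A minimal sketch of the arithmetic behind this suggestion, in plain Python so it runs without any packages. The helper names (to_log_scale, from_log_scale) and the example values are made up for illustration and are not real ANC data; the only things taken from the thread are the offset of 17 and the reported range of roughly -16 to 400.

```python
import math

OFFSET = 17.0  # offset suggested in the reply; it must exceed -min(ANC), and min(ANC) is -16 here


def to_log_scale(anc, offset=OFFSET):
    """Shifted log transform: y = log(ANC + offset)."""
    if anc + offset <= 0:
        raise ValueError("ANC + offset must be positive for the log to exist")
    return math.log(anc + offset)


def from_log_scale(y, offset=OFFSET):
    """Back-transform a value on the log scale to the original ANC scale."""
    return math.exp(y) - offset


# Hypothetical values spanning the range reported in the question; not real data.
example_anc = [-16.0, -5.0, 0.0, 50.0, 400.0]
transformed = [to_log_scale(a) for a in example_anc]

# ANC = 0 maps to log(17), so originally negative values are exactly those below it.
threshold = to_log_scale(0.0)

for a, y in zip(example_anc, transformed):
    print(f"ANC={a:7.1f}  y={y:6.3f}  originally negative: {y < threshold}")
```

The transform is monotone, so the ordering of sites and the "negative ANC" cutoff both carry over to the new scale, and predictions can be moved back with from_log_scale. In lme4 the corresponding fit would presumably look something like lmer(log(ANC + 17) ~ Year + (1 | Site), data = d), where Year and d are placeholder names not taken from the original post.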
On Apr 22, 2014, at 8:37 AM, Brooks Miner wrote:
> I am a longtime user of nlme and then lme4 packages, and for the most part I know what I'm doing.
> At the moment I'm grappling with a particularly difficult response variable that I would like to analyze in a mixed-effects model: acid-neutralizing capacity (ANC) <https://en.wikipedia.org/wiki/Acid_neutralizing_capacity>. Although a very important measurement for lakes and streams, especially those recovering from acidification, ANC has difficult properties as a response variable: for example, in my current dataset the values range from -16 to ca. 400, and they are not normally distributed by any stretch of the imagination (see this figure: http://www.eeb.cornell.edu/miner/images/ANC_histogram.png )
> There are negative values that are really important because they indicate water bodies in especially bad shape (called "Acute Concern" by the National Acid Precipitation Assessment Program).
> I have time-series data for ANC from 1988 to the present for 60 sampling sites, and I'd *really* like to use a mixed-effects model, with a random effect of "Site," to model how ANC values have been changing over time, overall across the 60 sites. A random effect for "Site" is an ideal way to deal with the temporal pseudoreplication inherent in the time-series data.
> My challenge: how to deal with my non-normal ANC response variable using lmer() or glmer()? Of course when I run it with a Gaussian error distribution, the Q-Q plot of residuals looks terrible. Because of the negative values, I can't log- or sqrt-transform, use Box-Cox, or use family="Gamma". All of the existing literature analyzing ANC time series uses non-parametric methods (such as a Mann-Kendall test), but I'd really like to move beyond that in order to take advantage of a (G)LMM in order to draw general conclusions across all 60 sampling sites.
> Any suggestions for how to deal with this frustratingly unique ANC response variable?
> Many thanks ~
> - Brooks
> Brooks Miner
> Postdoctoral Fellow
> Department of Ecology & Evolutionary Biology
> Cornell University