[R] CoDA: Count Zeros in Biological Data
Rich Shepard
rshepard at appl-ecosys.com
Fri Jun 20 03:23:58 CEST 2014
I have several small biological count-based data sets in which one or more
rows contain a zero proportion. The remaining proportions in such a row sum
to 1.000 (or to 0.9999 in the sixth data row below, because of rounding
error). An example is:
sampdate    filter  gather  graze   predate  shred
2000-07-18  0.0550  0.5596  0.0734  0.2294   0.0826
2003-07-08  0.0734  0.6147  0.0183  0.2294   0.0642
2005-07-13  0.1161  0.5714  0.0357  0.1696   0.1071
2006-06-28  0.1000  0.4667  0.1500  0.1333   0.1500
2010-09-14  0.0778  0.6111  0.0444  0.1889   0.0778
2011-07-13  0.0879  0.5714  0.0659  0.2747   0.0000
2012-07-11  0.1042  0.5313  0.0625  0.2396   0.0625
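
For anyone who wants to reproduce the example, here is a minimal base-R sketch that reads the table, reports the row sums, and locates the zero. The object names (dat, parts, row_sums, zero_pos) are only illustrative and do not come from any package.

```r
# Read the example table; the text is copied from the rows above.
dat <- read.table(header = TRUE, text = "
sampdate   filter gather graze  predate shred
2000-07-18 0.0550 0.5596 0.0734 0.2294  0.0826
2003-07-08 0.0734 0.6147 0.0183 0.2294  0.0642
2005-07-13 0.1161 0.5714 0.0357 0.1696  0.1071
2006-06-28 0.1000 0.4667 0.1500 0.1333  0.1500
2010-09-14 0.0778 0.6111 0.0444 0.1889  0.0778
2011-07-13 0.0879 0.5714 0.0659 0.2747  0.0000
2012-07-11 0.1042 0.5313 0.0625 0.2396  0.0625
")

# Keep only the five feeding-group proportions as a numeric matrix.
parts <- as.matrix(dat[, -1])

# Row totals; small departures from 1 come from the four-decimal rounding.
row_sums <- rowSums(parts)
print(round(row_sums, 4))

# Row and column indices of every zero proportion.
zero_pos <- which(parts == 0, arr.ind = TRUE)
print(zero_pos)
```

The zero-position matrix gives the sample (row) and feeding group (column) that would need a replacement value before any log-ratio transformation.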
My concern is that in most field-biological (ecological/environmental)
data there can be two explanations for zero counts: the organism was not
present on that date or it was present but not collected. There is no way to
determine which case holds true in each instance, but the ecological
interpretations differ.
The zCompositions package offers several methods of imputing a value to
replace the zeros. As I'm completely new to compositional data analysis
(CoDA), I would appreciate advice on how to select the most appropriate
method for these data sets. The available methods are: geometric Bayesian
multiplicative (GBM, the default); square root Bayesian multiplicative (SQ);
Bayes-Laplace Bayesian multiplicative (BL); count zero multiplicative (CZM);
and user-specified hyper-parameters (user).
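
To make the idea of "multiplicative" replacement concrete, here is a rough base-R sketch of the simplest version: the zero is set to a small value and the non-zero parts are shrunk by a common factor, so the row still sums to 1 and the ratios among the observed parts are unchanged. This is not any of the zCompositions methods listed above, which estimate the replacement value from the counts. The function name mult_repl and the value delta = 0.005 are hypothetical choices made only for illustration.

```r
# Simple multiplicative zero replacement for one composition (a sketch).
# x:     numeric vector of proportions, possibly containing zeros
# delta: small replacement value for each zero (hypothetical choice)
mult_repl <- function(x, delta) {
  x <- x / sum(x)                    # re-close so the parts sum to exactly 1
  zero <- x == 0
  # Shrink the non-zero parts by a common factor to make room for the deltas.
  x[!zero] <- x[!zero] * (1 - delta * sum(zero))
  x[zero] <- delta
  x
}

# Sixth data row of the example (2011-07-13), where shred is zero.
row6 <- c(filter = 0.0879, gather = 0.5714, graze = 0.0659,
          predate = 0.2747, shred = 0.0000)

filled <- mult_repl(row6, delta = 0.005)
print(round(filled, 4))
```

Because every non-zero part is multiplied by the same factor, ratios such as filter/gather are the same before and after replacement; only the choice of delta is arbitrary here, which is exactly the choice the package methods try to make in a principled way.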
These biological data seem to me to be different from the geochemical or
economic data I see in package data sets and in the CoDA references I've
acquired and read.
Advice and suggestions (including references to application of CoDA to
ecological/environmental data) will be appreciated.
Rich