[R-sig-eco] Package 'compositions'; Function dnorm.acomp()
separent at yahoo.com
separent at yahoo.com
Tue Sep 30 20:35:35 CEST 2014
Hi Rich,
Filzmoser et al. (2009) wrote that "Some measures like the standard deviation (or the variance) make no statistical sense with closed data [...]". They also wrote that "If Euclidean geometry is not valid, the arithmetic mean is quite likely to be a poor estimate of the data center."
See Filzmoser et al. (2009) http://www.statistik.tuwien.ac.at/forschung/SM/SM-2009-2.pdf
As Euclidean geometry is not valid for compositions, you have to compute the mean in the ilr or clr space (both are euclidean, alr is not). The mean.acomp function computes the mean in euclidean space, then back-transform the result in the compositional space.
library(compositions)
# Data
comp = matrix(c(0.0667, 0.0612061206120612, 0.0435, 0.044, 0.05,
0.0161, 0.6, 0.571457145714572, 0.6232, 0.5934, 0.4333, 0.629,
0.0667, 0.0612061206120612, 0.1014, 0.0659, 0.0667, 0.0323, 0.2444,
0.265326532653265, 0.2174, 0.2637, 0.3667, 0.2903, 0.0222, 0.0408040804080408,
0.0145, 0.033, 0.0833, 0.0323), ncol=5)
# Mean
colMeans(unclass(comp)) ## biased mean
mean(comp) ## unbiased mean, calls mean.acomp under the hood
sbp = matrix(c( 1, 1, 1,-1,-1, ## A dummy sequential binary partition
1,-1,-1, 0, 0,
0, 1,-1, 0, 0,
0, 0, 0, 1,-1),
ncol=5, byrow=TRUE)
psi = gsi.buildilrBase(t(sbp)) ## The orthonormal matrix
balances = ilr(comp, V=psi) ## computing the orthonormal balances
bal_mean = colMeans(balances) ## means of balances
ilrInv(bal_mean, V=psi) ## back-transform the mean in the compositional space
You see that the back-transformed mean is equal to mean.acomp(comp).
The total variance estimator is computed using eq. 10 in Filzmoser et al. (2009). This is what mvar does.
# Variance
sum1 = 0
for (i in 1:(ncol(comp)-1)) {
sum2 = 0
for (j in (i+1):ncol(comp)) {
sum2 = sum2 + var(log(comp[,i]/comp[,j]))
}
sum1 = sum1+sum2
}
tot_var = sum1/ncol(comp)
tot_var
mvar(comp)
The variance-covariance matrix of compositions should be computed in a log-ratio space. So var, sd, confidence intervals and p-values should be computed on your transformed data. Although confidence intervals on compositions are widely seen in the litterature, they can be misleading.
I prefer to compute the variance in the ilr space and put the confidence intervals in a CoDaDendrogram, then put only the means on compositions in a table below the dendrogram, as in Figure 5 of Parent et al. (2012) - I’ll send you the plot function if you want it,
http://www.frontiersin.org/files/Articles/63683/fpls-04-00449-HTML/image_m/fpls-04-00449-g005.JPG
Regards,
Serge-Étienne
De : Rich Shepard
Envoyé : mardi, 30 septembre 2014 12:45
À : <r-sig-ecology at r-project.org>
For a data set of count proportions, testing for fit to a multivariate
normal distribution is done with the function dnorm.acomp() in package
'compositions'. The function's calling parameters are the data set, mean,
and variance.
Example data set:
>dput(win.acomp)
structure(c(0.0667, 0.0612061206120612, 0.0435, 0.044, 0.05,
0.0161, 0.6, 0.571457145714572, 0.6232, 0.5934, 0.4333, 0.629,
0.0667, 0.0612061206120612, 0.1014, 0.0659, 0.0667, 0.0323, 0.2444,
0.265326532653265, 0.2174, 0.2637, 0.3667, 0.2903, 0.0222, 0.0408040804080408,
0.0145, 0.033, 0.0833, 0.0323), .Dim = c(6L, 5L), .Dimnames = list(
NULL, c("Filterer", "Gatherer", "Grazer", "Predator", "Shredder"
)), class = "acomp")
The mean() function returns the mean value for each column:
> mean(win.acomp)
Filterer Gatherer Grazer Predator Shredder
0.04386630 0.58270151 0.06366245 0.27664502 0.03312472
and the multivariate function, mvar(), returns a single value:
> mvar(win.acomp)
[1] 0.6309852
The dnorm.acomp() syntax, according to ?dnorm.acomp has a single value for
the mean:
dnorm.acomp(x,mean,var)
which raises the question of which mean value do I use for a data set?
TIA,
Rich
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
[[alternative HTML version deleted]]
More information about the R-sig-ecology
mailing list