[R-sig-eco] Package 'compositions'; Function dnorm.acomp()

separent at yahoo.com separent at yahoo.com
Tue Sep 30 20:35:35 CEST 2014


Hi Rich,




Filzmoser et al. (2009) wrote that "Some measures like the standard deviation (or the variance) make no statistical sense with closed data [...]". They also wrote that "If Euclidean geometry is not valid, the arithmetic mean is quite likely to be a poor estimate of the data center."




See Filzmoser et al. (2009) http://www.statistik.tuwien.ac.at/forschung/SM/SM-2009-2.pdf




As Euclidean geometry is not valid for compositions, you have to compute the mean in the ilr or clr space (both are euclidean, alr is not). The mean.acomp function computes the mean in euclidean space, then back-transform the result in the compositional space.




library(compositions)

# Data
comp = matrix(c(0.0667, 0.0612061206120612, 0.0435, 0.044, 0.05, 
  0.0161, 0.6, 0.571457145714572, 0.6232, 0.5934, 0.4333, 0.629, 
  0.0667, 0.0612061206120612, 0.1014, 0.0659, 0.0667, 0.0323, 0.2444, 
  0.265326532653265, 0.2174, 0.2637, 0.3667, 0.2903, 0.0222, 0.0408040804080408, 
  0.0145, 0.033, 0.0833, 0.0323), ncol=5)

# Mean
colMeans(unclass(comp)) ## biased mean
mean(comp) ## unbiased mean, calls mean.acomp under the hood




sbp = matrix(c( 1, 1, 1,-1,-1, ## A dummy sequential binary partition
                1,-1,-1, 0, 0,
                0, 1,-1, 0, 0,
                0, 0, 0, 1,-1),
             ncol=5, byrow=TRUE)
psi = gsi.buildilrBase(t(sbp)) ## The orthonormal matrix
balances = ilr(comp, V=psi) ## computing the orthonormal balances
bal_mean = colMeans(balances) ## means of balances
ilrInv(bal_mean, V=psi) ## back-transform the mean in the compositional space




You see that the back-transformed mean is equal to mean.acomp(comp).




The total variance estimator is computed using eq. 10 in Filzmoser et al. (2009). This is what mvar does.




# Variance
sum1 = 0
for (i in 1:(ncol(comp)-1)) {
  sum2 = 0
  for (j in (i+1):ncol(comp)) {
    sum2 = sum2 + var(log(comp[,i]/comp[,j]))
  }
  sum1 = sum1+sum2
}
tot_var = sum1/ncol(comp)
tot_var
mvar(comp)




The variance-covariance matrix of compositions should be computed in a log-ratio space. So var, sd, confidence intervals and p-values should be computed on your transformed data. Although confidence intervals on compositions are widely seen in the litterature, they can be misleading.




I prefer to compute the variance in the ilr space and put the confidence intervals in a CoDaDendrogram, then put only the means on compositions in a table below the dendrogram, as in Figure 5 of Parent et al. (2012) - I’ll send you the plot function if you want it, 
http://www.frontiersin.org/files/Articles/63683/fpls-04-00449-HTML/image_m/fpls-04-00449-g005.JPG




Regards,




Serge-Étienne








De : Rich Shepard
Envoyé : ‎mardi‎, ‎30‎ ‎septembre‎ ‎2014 ‎12‎:‎45
À : <r-sig-ecology at r-project.org>





   For a data set of count proportions, testing for fit to a multivariate
normal distribution is done with the function dnorm.acomp() in package
'compositions'. The function's calling parameters are the data set, mean,
and variance.

   Example data set:

>dput(win.acomp)
structure(c(0.0667, 0.0612061206120612, 0.0435, 0.044, 0.05, 
0.0161, 0.6, 0.571457145714572, 0.6232, 0.5934, 0.4333, 0.629, 
0.0667, 0.0612061206120612, 0.1014, 0.0659, 0.0667, 0.0323, 0.2444, 
0.265326532653265, 0.2174, 0.2637, 0.3667, 0.2903, 0.0222, 0.0408040804080408, 
0.0145, 0.033, 0.0833, 0.0323), .Dim = c(6L, 5L), .Dimnames = list(
     NULL, c("Filterer", "Gatherer", "Grazer", "Predator", "Shredder"
     )), class = "acomp")

   The mean() function returns the mean value for each column:

> mean(win.acomp)
   Filterer   Gatherer     Grazer   Predator   Shredder
0.04386630 0.58270151 0.06366245 0.27664502 0.03312472

and the multivariate function, mvar(), returns a single value:

> mvar(win.acomp)
[1] 0.6309852

   The dnorm.acomp() syntax, according to ?dnorm.acomp has a single value for
the mean:

dnorm.acomp(x,mean,var)

which raises the question of which mean value do I use for a data set?

TIA,

Rich

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
	[[alternative HTML version deleted]]



More information about the R-sig-ecology mailing list