[R-sig-eco] Data transformations and ordinations on relative abundance data subsets

trichter trichter at uni-bremen.de
Thu May 7 14:38:14 CEST 2015


Hi, these are follow-up questions to
http://stats.stackexchange.com/questions/151126/chi-square-transformation-on-a-partially-unknown-matrix/151151#151151
Sorry for the partial cross-posting. I commented on Gavin's answer there,
but I am unsure whether those comments have been read.

I have a dataset of bacterial sequencing reads. Each sample consists of ~1
million bacterial reads. I extracted the reads representing a single phylum
and ran a computationally intensive OTU clustering on this subset only.
Until now, I have first converted my absolute counts to values relative to
ALL reads, and passed this relativized subset on to data analysis. I am
interested in shifts not only within the subset but also in relation to the
complete dataset.

I learned that CCA applies an internal chi-square transformation to my
abundance matrix. Not only does this add a second relativization (by the
subset row sums of an already relative dataset), it also uses the (subset)
matrix total. I suspect that my CCA is flawed, because the information about
the total dataset is lost and my input data are not suitable.
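
For reference, the chi-square transformation as it is commonly written is
sketched below. Whether this is exactly the form applied internally by a given
CCA implementation is an assumption here. The symbols are y_{ij} for the count
of OTU j in sample i, y_{i+} for the row sum, y_{+j} for the column sum, and
y_{++} for the matrix total:

```latex
y'_{ij} = \sqrt{y_{++}} \, \frac{y_{ij}}{y_{i+} \sqrt{y_{+j}}}
```

Every term except y_{ij} itself depends on which rows and columns are in the
matrix, which is why a subset matrix yields different transformed values than
the complete one.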

The first step to address this is to use absolute counts. Still, the CCA
would not know the row sums of the total matrix.
I had the idea of taking my subset and adding a dummy OTU (containing all
"other" reads) to it.
But
a) if I pass this to CCA, I would not know how to remove this "Super-OTU"
from the analysis (given that it needs to be included in the transformation).
b) if I do a manual chi-square transformation, manually remove this
"Super-OTU", and pass the resulting matrix to CCA, that would not prevent the
CCA from applying its internal transformation again.
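
To make the arithmetic of option (b) concrete, here is a minimal sketch in
plain Python. It only illustrates the manual transformation with a "Super-OTU"
column that is dropped afterwards; it is not the internal code of any CCA
implementation. The function names and the example counts are hypothetical,
and the formula is the common chi-square transformation (value times square
root of the matrix total, divided by row sum and square root of column sum).

```python
import math


def chi_square_transform(counts):
    """Chi-square transform a samples x OTUs count matrix (list of lists).

    Each value becomes sqrt(total) * y / (row_sum * sqrt(col_sum)).
    Assumes no row or column sums to zero.
    """
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    total = sum(row_sums)
    return [
        [math.sqrt(total) * y / (r * math.sqrt(c)) for y, c in zip(row, col_sums)]
        for row, r in zip(counts, row_sums)
    ]


def transform_with_super_otu(subset_counts, total_reads):
    """Append an 'all other reads' column, transform, then drop that column.

    Row sums and the matrix total then reflect the complete dataset,
    while only the subset OTUs remain in the returned matrix.
    """
    augmented = [
        row + [n - sum(row)] for row, n in zip(subset_counts, total_reads)
    ]
    transformed = chi_square_transform(augmented)
    return [row[:-1] for row in transformed]


# Hypothetical example: 3 samples x 2 OTUs of the phylum of interest,
# plus the total number of reads in each sample.
subset = [[10, 30], [20, 20], [5, 15]]
totals = [1000, 1000, 500]

result = transform_with_super_otu(subset, totals)
for row in result:
    print(row)
```

The returned matrix has the same shape as the subset, but its values are
scaled by the full-sample read totals. It does not settle the concern raised
in (b): an ordination that applies its own chi-square weighting would still
transform this matrix again.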

Do you have any idea how to perform a CCA with row sums different from those
of the matrix passed to CCA?


That would be extremely awesome. My only alternative would be to drop the
entire "observations relative to all bacteria" approach and consider only
shifts within the subset.

Many, many thanks!


