[R] testing whether clusters in a PCA plot are significantly different from one another

Marchesi, Julian j.marchesi at imperial.ac.uk
Fri Jan 6 16:32:51 CET 2017


Many thanks, David, for such a swift response. I really appreciate your help.

cheers

Julian

Julian R. Marchesi

Deputy Director and Professor of Clinical Microbiome Research at the  Centre for Digestive and Gut Health, Imperial College London, London W2 1NY Tel: +44 (0)20 331 26197

and

Professor of Human Microbiome Research at the School of Biosciences, Museum Avenue, Cardiff University, Cardiff, CF10 3AT, Tel: +44 (0)29 208 74188, Fax: +44 (0)29 20874305, Mobile 07885 569144




________________________________________
From: David L Carlson <dcarlson at tamu.edu>
Sent: 06 January 2017 15:29
To: Marchesi, Julian; r-help at r-project.org
Subject: RE: [R] testing whether clusters in a PCA plot are significantly different from one another

In that case you should be able to use MANOVA, where pc1 and pc2 are the dependent (response) variables and group (Baseline, HFD+P, HFD) is the independent (explanatory) variable. Something like lm(cbind(pc1, pc2) ~ group). Calling summary() on that fit will give you coefficients for HFD+P and HFD (the difference in mean relative to Baseline), with t-values and p-values, for each component. You can get further diagnostics using package candisc. But your sample size is very small, so there may be better approaches that a statistician specializing in medical research could suggest.
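
A minimal sketch of that suggestion, using simulated scores because the original data were not posted. The data frame name scores, the shift values, and the group size of 5 are assumptions made only for illustration; in practice pc1 and pc2 would be taken from the PCA output (for example the first two columns of prcomp(...)$x).

```r
# Simulated PC scores for three groups (illustrative values only)
set.seed(1)
group <- factor(rep(c("Baseline", "HFD+P", "HFD"), each = 5),
                levels = c("Baseline", "HFD+P", "HFD"))
shift <- c(Baseline = 0, "HFD+P" = 1, HFD = 3)[as.character(group)]
pc1 <- rnorm(15) + shift
pc2 <- rnorm(15) - 0.5 * shift
scores <- data.frame(pc1, pc2, group)

# Multivariate linear model: one regression per component,
# with Baseline as the reference level
fit <- lm(cbind(pc1, pc2) ~ group, data = scores)
print(summary(fit))   # coefficients, t-values, p-values per component

# Overall multivariate test of the group effect on (pc1, pc2) jointly
mfit <- manova(cbind(pc1, pc2) ~ group, data = scores)
mtab <- summary(mfit, test = "Pillai")
print(mtab)
pillai_p <- mtab$stats["group", "Pr(>F)"]
```

The per-component summaries show which component separates which group from Baseline, while the Pillai test gives a single p-value for the groups as a whole. With only a handful of samples per group, both should be read with caution.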

David C

-----Original Message-----
From: Marchesi, Julian [mailto:j.marchesi at imperial.ac.uk]
Sent: Friday, January 6, 2017 9:02 AM
To: David L Carlson
Subject: Re: [R] testing whether clusters in a PCA plot are significantly different from one another

Dear David

The clusters are defined by the metadata, which tells R where to draw the lines, no more, no less.

How would I put a P value to those clusters?

cheers

Julian

________________________________________
From: David L Carlson <dcarlson at tamu.edu>
Sent: 06 January 2017 14:26
To: Marchesi, Julian
Subject: RE: [R] testing whether clusters in a PCA plot are significantly different from one another

You do not say how you defined the clusters in the plot that you attached. If you used the variables summarized by the principal components, the answer is yes, they are "significantly different".

Cluster analysis creates homogeneous clusters that will almost always be "significantly different" using standard tests such as analysis of variance. BUT these tests are only meaningful when the clusters are defined independently of the data.
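
The point can be illustrated with a small simulation, sketched here under stated assumptions: the data are pure noise drawn from a single population, kmeans stands in for whatever clustering method might be used, and the object names (x, pc, cl, lab) are hypothetical.

```r
set.seed(42)
# Pure noise: one homogeneous population, no real groups
x <- matrix(rnorm(200), ncol = 2, dimnames = list(NULL, c("v1", "v2")))
pc <- prcomp(x)$x[, 1:2]

# Clusters derived from the very scores that are then tested
cl <- factor(kmeans(pc, centers = 3, nstart = 20)$cluster)
p_circular <- summary(manova(pc ~ cl))$stats["cl", "Pr(>F)"]

# Labels assigned independently of the data (like external metadata)
lab <- factor(sample(rep(1:3, length.out = nrow(pc))))
p_independent <- summary(manova(pc ~ lab))$stats["lab", "Pr(>F)"]

print(c(circular = p_circular, independent = p_independent))
```

The first p-value comes out tiny even though no real groups exist, because the clusters were built to be different; only the second test, where group labels do not depend on the data, is a meaningful one.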


David L. Carlson
Department of Anthropology
Texas A&M University



-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Marchesi, Julian
Sent: Friday, January 6, 2017 1:43 AM
To: 'r-help at r-project.org' <r-help at r-project.org>
Subject: [R] testing whether clusters in a PCA plot are significantly different from one another

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


