[R] clustering based on most significant pvalues does not separate the groups!

S Ellison S.Ellison at LGCGroup.com
Wed Jul 6 11:13:53 CEST 2011


t-tests and the like test for a difference in mean value, not for non-overlapping populations or data sets.

A significant difference between the means of two data sets does not imply that the individual points in each set occupy disjoint ranges. For example:

set.seed(1023)

x <- rnorm(60, 10)   # 60 draws from a normal with mean 10, sd 1
y <- x + 0.75        # the same values shifted up by 0.75
boxplot(x, y)
	# Lots of overlap for individual points
t.test(x, y)
	# Strongly significant difference
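
The overlap can also be put into numbers rather than read off the boxplot. The following is only a rough sketch that rebuilds the same x and y. The names p_value, ranges_overlap, cutoff and misclassified are illustrative choices and are not part of any package. It checks whether the two ranges intersect and how many points land on the wrong side of a cutoff placed midway between the two means.

```r
set.seed(1023)                 # same seed as the example above
x <- rnorm(60, 10)
y <- x + 0.75

# Welch two-sample t-test on the means
p_value <- t.test(x, y)$p.value

# TRUE when the ranges of x and y intersect
ranges_overlap <- min(y) <= max(x) && min(x) <= max(y)

# Naive one-variable classifier: split at the midpoint of the two means
# and count the points that land on the wrong side
cutoff <- (mean(x) + mean(y)) / 2
misclassified <- mean(c(x > cutoff, y < cutoff))

cat("p-value:", format.pval(p_value), "\n")
cat("ranges overlap:", ranges_overlap, "\n")
cat("fraction misclassified:", round(misclassified, 2), "\n")
```

A small p-value alongside a sizeable misclassified fraction is the same pattern as a probe that passes the t-test but cannot, on its own, assign individual samples to their group.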

Does that correspond to your situation well enough to account for your puzzlement?


S Ellison

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of pguilha
> Sent: 04 July 2011 19:22
> To: r-help at r-project.org
> Subject: [R] clustering based on most significant pvalues 
> does not separate the groups!
> 
> Hi all,
> 
> I have some microarray data on 40 samples that fall into two 
> groups. I have a value for 480k probes for each of those 
> samples. I performed a t test
> (rowttests) on each row (giving the indices of the columns for 
> each group) then used p.adjust() to adjust the pvalues for 
> the number of tests performed. I then selected only the 
> probes with adj-p.value<=0.05. I end up with roughly 2000 
> probes to do the clustering on but using pvclust, and hclust, 
> the samples do not split up into the two groups. I would have 
> imagined that using only those values that are significantly 
> different between the two groups, the clustering should 
> surely reflect that?
> 
> Please, what am I missing!!!!???
> 
> Thanks!
> 
> Paul
> 
> PS: I am hoping I have just thought this through in the wrong 
> way and there is a simple explanation, but can provide the 
> code I am using for clustering if necessary!
> 