[BioC] error under "hclust" for microarray clustering

James W. MacDonald jmacdon at med.umich.edu
Thu Jan 27 15:43:50 CET 2011


Hi Avhena,

On 1/27/2011 12:00 AM, avehna wrote:
> Hi All,
>
> I'm trying to cluster 21657 genes that are differentially expressed in my
> microarray data, but it's actually not working for me. After reading the
> normalized signal and calculating the mean for each treatment I proceed to
> read the list of genes differentially expressed (previously calculated using
> limma). The problem occurs during "hclust" function (please see below my
> code and corresponding error). Is it possible for this error to be due to
> the number of genes? when I use the same code for only 1000 genes it works
> pretty well.
>
> How could I solve this problem? I need this figure for my paper...

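You need to remove the rows that have no variability. A row with zero
variance has an undefined correlation with every other row, so cor()
returns NA for it, and hclust() then fails on the NA distances. For example: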

 > dat <- matrix(rnorm(1000), ncol=10)
 > dat[3,] <- rep(dat[3,3], 10) ## make row three have var=0
 > hclust(as.dist(1-cor(t(dat), method="spearman")), method="complete")
Error in hclust(as.dist(1 - cor(t(dat), method = "spearman")), method = 
"complete") :
   NA/NaN/Inf in foreign function call (arg 11)
In addition: Warning message:
In cor(t(dat), method = "spearman") : the standard deviation is zero

Now again, without this row:

 > hclust(as.dist(1-cor(t(dat[-3,]), method="spearman")), method="complete")

Call:
hclust(d = as.dist(1 - cor(t(dat[-3, ]), method = "spearman")), 
method = "complete")

Cluster method   : complete
Number of objects: 99

For your data, something like

ind <- apply(mysubset, 1, var) == 0
mysubset <- mysubset[!ind,]

should do the trick.
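
A slightly more defensive variant of that filter, as a minimal sketch on toy data rather than on your actual mysubset: it assumes that rows containing missing values should be dropped as well (for instance, rows that come back as all-NA when match() finds no matching probe ID). The names dat, keep and filtered are made up for the illustration.

```r
set.seed(1)                              # reproducible toy data
dat <- matrix(rnorm(1000), ncol = 10)    # 100 "genes" x 10 "arrays"
dat[3, ] <- dat[3, 3]                    # row 3 is constant, so var = 0
dat[7, ] <- NA                           # row 7 mimics an unmatched probe ID

# Keep rows that have no missing values and a strictly positive spread
row_sd <- apply(dat, 1, sd)
keep <- complete.cases(dat) & !is.na(row_sd) & row_sd > 0
filtered <- dat[keep, , drop = FALSE]

# Check the distances before clustering, so a leftover NA is caught here
# rather than inside hclust()
d <- as.dist(1 - cor(t(filtered), method = "spearman"))
stopifnot(!anyNA(d))

hr <- hclust(d, method = "complete")
```

Checking sum(!keep) first tells you how many genes are being dropped, which is worth knowing before the figure goes into a paper.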

Best,

Jim


>
> Thank you for your help!
>
> Sincerely,
> Avhena
>
>
>
> ************************************************
>> signal<-signal[-grep("AFFX",rownames(signal)), ,drop=FALSE]
>> pDatam<- read.AnnotatedDataFrame('pdatam.txt', row.names = 1, header =
> TRUE, sep = '\t')
>> pData<- read.AnnotatedDataFrame('pdata.txt', row.names = 1, header =
> TRUE, sep = '\t')
>> expset<- new("ExpressionSet", exprs = signal, phenoData = pData)
>
>> means1<- means(pairwise.comparison(expset, "Type", c("Control", "BMP"),
> method="logged", logged=FALSE))
>> means2<- means(pairwise.comparison(expset, "Type", c("BMP.VPA",
> "SHH.1D"), method="logged", logged=FALSE))
>> means3<- means(pairwise.comparison(expset, "Type", c("SHH.6H",
> "SHH.VPA.1D"), method="logged", logged=FALSE))
>> all_means<-cbind(means1,means2,means3)
>> expmeans<- new("ExpressionSet", exprs = all_means, phenoData = pDatam)
>> subset<-get.array.subset(expmeans, "Type", c("Control", "BMP", "SHH.1D",
> "SHH.VPA.1D"))
>> genes<-read.table("affy_ids_diff_exprs05.dat")
>> mysubset<-exprs(subset)[match(levels(genes[,]), rownames(exprs(subset))),]
>
>> hr<- hclust(as.dist(1-cor(t(mysubset), method="spearman")),
> method="complete")
>
> Error in hclust(as.dist(1 - cor(t(mysubset), method = "spearman")), method =
> "complete") :
> NA/NaN/Inf in foreign function call (arg 11)
> Calls: hclust ->  .Fortran
> In addition: Warning message:
> In cor(t(mysubset), method = "spearman") : the standard deviation is zero
> Execution halted

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826