[BioC] Correlation works, but dist() runs out of memory
Sean Davis
sdavis2 at mail.nih.gov
Tue Mar 13 17:14:32 CET 2007
On Tuesday 13 March 2007 11:34, Daniel Brewer wrote:
> I am attempting to plot a hierarchical clustering dendrogram of a
> modestly sized gene expression matrix of 22011 x 16.
>
> If I choose to use a correlation measure it works fine (
> c2 <- cor(ExonExpr)
> d2 <- as.dist(1-c2)
> hier2 <- hclust(d2,method="average")
> ). If I try to create a Euclidean distance object it crashes out with a
> memory error (
>
> > Error in vector("double", length) : vector size specified is too large
>
> ).
>
> This seems strange as I have 3GB RAM, which I would think is plenty. Any
> ideas what is going wrong or how to get round this?
Hi, Dan.
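You probably want to run dist() on the transposed matrix. dist() computes
distances between the rows of its argument, whereas cor() computes
correlations between the columns, so dist(ExonExpr) tries to build a
22011 x 22011 distance object while cor(ExonExpr) only builds a 16 x 16 one.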
> a <- matrix(rnorm(20000),nc=10) # a 2000 x 10 matrix
> b <- dist(a)
> dim(as.matrix(b))
[1] 2000 2000
> d <- cor(a)
> dim(d)
[1] 10 10
Note the difference in sizes of the matrices.
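
To put rough numbers on this, here is a minimal sketch. It assumes the goal
is to cluster the 16 samples (as the correlation version does); the matrix
`expr` is a small random stand-in rather than the real ExonExpr data, and
the names n_genes, n_samples, expr, d_cor, d_euc and hier_euc are made up
for illustration. A dist object on 22011 rows holds about 242 million
doubles, roughly 1.8 GB in a single vector, which may well be more than R
can allocate in one piece on a 3GB machine.

```r
# Dimensions taken from the question: 22011 genes (rows) x 16 samples (columns)
n_genes <- 22011
n_samples <- 16

# dist() on n rows stores the lower triangle only: n * (n - 1) / 2 doubles
n_pairs <- n_genes * (n_genes - 1) / 2
size_gb <- n_pairs * 8 / 1024^3   # 8 bytes per double
cat("gene-by-gene dist object:", n_pairs, "values, about",
    round(size_gb, 2), "GB\n")

# Small random stand-in for the expression matrix (assumption, not real data)
set.seed(1)
expr <- matrix(rnorm(200 * n_samples), ncol = n_samples)

# Correlation distance between samples: cor() works on columns
d_cor <- as.dist(1 - cor(expr))

# Euclidean distance between samples: dist() works on rows, so transpose first
d_euc <- dist(t(expr))

# Both distance objects now describe the same 16 samples
hier_euc <- hclust(d_euc, method = "average")
```

With the real data, the equivalent call would be dist(t(ExonExpr)), which
gives a 16 x 16 distance object that can be passed to hclust() in the same
way as d2 in the original post.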
Sean