[BioC] Correlation works, but dist() runs out of memory
Wolfgang Huber
huber at ebi.ac.uk
Tue Mar 13 18:12:14 CET 2007
Dear Daniel,
Please read the posting guide, which recommends that you give a
reproducible example and the output of sessionInfo(). Also, there is no
such thing as Bioconductor 0.9.
1) Are you sure you are giving it "only" a 22011 x 16 matrix? I get
> a=numeric(2^31-1)
Error in vector("double", length) : cannot allocate vector of length
2147483647
> a=numeric(2^31)
Error in vector("double", length) : vector size specified is too large
and of course 2^31 >> choose(22011,2).
2) choose(22011,2)*8/1e6 = 1937.85, i.e. one copy of your distance matrix
would need about 1.9 GB of RAM, and if you have other large objects around
or if it needs to be copied, your 3 GB of RAM may not be enough. Rather than
using brute force, it might help to reduce the set of genes to an
interesting subset before doing the clustering.
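A rough sketch of points 1) and 2), in case it helps. ExonExpr below is simulated stand-in data, distSizeMB is a hypothetical helper, and both the variance filter and the cutoff of 500 genes are arbitrary assumptions for illustration, not recommendations.

```r
# Simulated stand-in for the real data: genes in rows, samples in columns.
set.seed(1)
ExonExpr <- matrix(rnorm(2000 * 16), nrow = 2000, ncol = 16)

# dist() compares rows (cor() compares columns), so check the
# dimensions of what is actually being passed in.
dim(ExonExpr)

# Approximate size in MB of one copy of a dist object on n rows
# (choose(n, 2) doubles at 8 bytes each).
distSizeMB <- function(n) choose(n, 2) * 8 / 1e6
distSizeMB(22011)            # the 22011-row case discussed above
distSizeMB(nrow(ExonExpr))   # the simulated example

# Keep only the most variable genes before clustering.
geneVar <- apply(ExonExpr, 1, var)
keep    <- order(geneVar, decreasing = TRUE)[1:500]
sub     <- ExonExpr[keep, ]

# Euclidean distances between the retained genes, then clustering.
d    <- dist(sub)
hier <- hclust(d, method = "average")
```

plot(hier) would then draw the dendrogram for the reduced gene set.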
> sessionInfo()
R version 2.5.0 Under development (unstable) (2007-03-13 r40832)
i686-pc-linux-gnu
locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] "stats" "graphics" "grDevices" "utils" "datasets" "methods"
[7] "base"
Best wishes
Wolfgang
------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
> I am attempting to plot a hierarchical clustering dendrogram of a
> modestly sized gene expression matrix of 22011 x 16.
>
> If I choose to use a correlation measure it works fine (
> c2 <- cor(ExonExpr)
> d2 <- as.dist(1-c2)
> hier2 <- hclust(d2,method="average")
> ). If I try to create a Euclidean distance object it crashes out with a
> memory error (
>> Error in vector("double", length) : vector size specified is too large
> ).
>
> This seems strange as I have 3 GB of RAM, which I would think is plenty. Any
> ideas what is going wrong or how to get round this?
>
>
> Thanks
>
> Dan
>
> PS Running R 2.4.1, Bioconductor 0.9 on SUSE 10.2 Linux.
>