[R] problems with a large data set

Moritz Lennert mlennert at ulb.ac.be
Wed Apr 25 18:03:24 CEST 2001


I have trouble with a data set of 2136 lines with 20 columns each.
I would like to run a hierarchical clustering on it and tried the following:

ages.hclust <- hclust(dist(ages, method="euclidean"), "ward")

but I get the following error message:

Error: cannot allocate vector of size 17797 Kb

When I run dist() on its own, without hclust(), I get the
same type of message.
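
For reference, a rough sketch of where an allocation of this size could come from. It assumes that dist() stores one 8-byte double for each pair of rows (the lower triangle of the distance matrix); the helper name dist_size_kb is made up for illustration.

```r
# Size of the vector dist() would need for n observations,
# assuming one 8-byte double per pair of rows (lower triangle only).
dist_size_kb <- function(n) {
  n_pairs <- n * (n - 1) / 2   # number of pairwise distances
  n_pairs * 8 / 1024           # bytes converted to Kb
}

dist_size_kb(2136)  # about 17814 Kb
dist_size_kb(2135)  # about 17797 Kb, the size in the error message
```

Under that assumption the failed allocation is the distance vector itself. The reported size matches 2135 rows rather than 2136, which would be consistent with one of the lines being a header, though that is only a guess.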

Then I tried the RPgSQL package by typing:

Connected to database "space" on "localhost" 
> bind.db.proxy("ages")
> ages.hclust <- hclust(dist(ages, method="euclidean"), "ward")

This time I get:

Error in dist(ages, method = "euclidean") : 
        NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message: 
NAs introduced by coercion

I've checked, and I can't find any missing values or anything similar.
Could someone tell me if I'm doing something wrong, or whether this is
just too much data for R?
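
The warning "NAs introduced by coercion" typically appears when text is converted to numbers, so one possible check is to look for columns that are not stored as numeric. The sketch below uses a small made-up data frame, ages_demo, as a stand-in for the real table; whether an RPgSQL proxy object behaves like an ordinary data frame here is an assumption.

```r
# Hypothetical stand-in for the real `ages` table: column b holds text.
ages_demo <- data.frame(a = c(1.5, 2.0, 3.5),
                        b = c("10", "n/a", "30"),
                        stringsAsFactors = FALSE)

# Names of the columns that are not stored as numbers.
non_numeric <- names(ages_demo)[!sapply(ages_demo, is.numeric)]

# Entries in those columns that cannot be converted to a number.
bad <- lapply(ages_demo[non_numeric], function(col) {
  col[is.na(suppressWarnings(as.numeric(as.character(col))))]
})

print(non_numeric)
print(bad)
```

If every column turns out to be numeric, both results are empty and the cause of the error lies elsewhere.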

Please send your response directly to me since I'm not on the mailing
list. Thank you.

Moritz Lennert
Chargé de recherche

tél: 32-2-650.65.16
fax: 32-2-650.50.92
email: mlennert at ulb.ac.be