RE: [R] cluster analysis: "error in vector("double", length): given vector size is too big" {Fehler in vector("double", length) : angegebene Vektorgröße ist zu groß}
Liaw, Andy
andy_liaw at merck.com
Fri Jan 27 03:36:27 CET 2006
Let's do a simple calculation: the dist object from a data set with
80000 cases would have
80000 * (80000 - 1) / 2
elements, each taking 8 bytes to store in double precision. That's about
25.6GB (roughly 23.8 GiB) if my arithmetic isn't too flaky. You'd have a
devil of a time trying to do this on a 64-bit machine with 32GB RAM, let
alone what you are using.
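
For anyone who wants to check the arithmetic, here is a minimal sketch in R. It assumes only what is stated above (80000 cases, 8 bytes per double) and ignores the small overhead of the object's attributes; the variable names are just for illustration.

```r
n <- 80000                      # number of cases, as in the original question
n_elements <- n * (n - 1) / 2   # pairwise distances kept in a dist object
bytes <- n_elements * 8         # 8 bytes per double-precision value

bytes / 1e9                     # size in GB (10^9 bytes), about 25.6
bytes / 2^30                    # size in GiB (2^30 bytes), a bit under 24
```

Either way you count it, that is more than ten times the 2 GB of RAM on the machine in question.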
You'd have a much better chance sticking with algorithms that do not require
storage of the (dis)similarity matrix.
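
As one possible illustration of such an algorithm, kmeans() from the stats package works on the data matrix directly and never builds the full distance matrix. The sketch below is not a recommendation for this particular data set: the simulated matrix x is a hypothetical stand-in for the real data, and the choice of 5 clusters is an arbitrary assumption.

```r
set.seed(1)

# Hypothetical stand-in for the real data: 80000 rows, 25 numeric columns.
x <- matrix(rnorm(80000 * 25), ncol = 25)

# k-means only keeps the data and the cluster centres in memory, so the
# n * (n - 1) / 2 pairwise distances are never stored.
# centers = 5 is an arbitrary choice made for this sketch.
fit <- kmeans(scale(x), centers = 5, nstart = 3, iter.max = 100)

table(fit$cluster)              # number of cases assigned to each cluster
```

The data matrix itself is only 80000 * 25 * 8 bytes, i.e. 16 MB. clara() in the cluster package is another option along these lines, though it works on subsamples internally, which may or may not be acceptable here.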
Andy
From: Markus Preisetanz
>
> Dear R Specialists,
>
> when trying to cluster a data.frame with about 80,000 rows
> and 25 columns I get the above error message. I tried hclust
> (using dist), agnes (entering the data.frame directly) and
> pam (entering the data.frame directly). What I actually do
> not want to do is generate a random sample from the data.
>
> The machine I run R on is a Windows 2000 Server (Pentium 4)
> with 2 GB of RAM.
>
> Does anybody know what to do?
>
> Sincerely
>
> Markus Preisetanz