[R] Can R handle medium and large size data sets?

Martin Lam tmlammail at yahoo.com
Tue Jan 24 21:13:07 CET 2006


Dear Gueorgui,

> Is it true that R generally cannot handle medium
> sized data sets (a couple of hundred thousand
> observations) and therefore large data sets (a
> couple of million observations)?

It depends on what you want to do with the data sets.
Loading them shouldn't be a problem, I think. But
analysing them with self-written R code can get (very)
slow, since R is an interpreted language (correct me if
I'm wrong). To increase speed you will often need to
experiment with the R code. For example, I've noticed
that processing data stored as a matrix is much faster
than processing the same data stored as a data.frame.
Writing the time-critical parts in C or C++, compiling
them and calling them from your R code is often the
best way.
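
To illustrate the matrix versus data.frame point, here is a
minimal sketch you can run yourself. The data, the size
(20,000 rows, kept small so it finishes quickly) and the
helper name loop_row_sums are all made up for this example,
and the timings will differ between machines and R versions,
so treat it as something to experiment with rather than a
benchmark.

```r
# Illustrative data only: 20,000 rows, 5 numeric columns.
set.seed(42)
n  <- 20000
m  <- matrix(rnorm(n * 5), nrow = n, ncol = 5)
df <- as.data.frame(m)

# Hypothetical helper: sum each row with an explicit loop.
# Works on a matrix and on an all-numeric data.frame.
loop_row_sums <- function(x) {
  out <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    out[i] <- sum(x[i, ])
  }
  out
}

# Time the same loop on both structures, then the built-in
# vectorised rowSums() on the matrix for comparison.
t_df  <- system.time(s_df  <- loop_row_sums(df))
t_mat <- system.time(s_mat <- loop_row_sums(m))
t_vec <- system.time(s_vec <- rowSums(m))

# Elapsed seconds for each variant.
print(c(data.frame = t_df[["elapsed"]],
        matrix     = t_mat[["elapsed"]],
        vectorised = t_vec[["elapsed"]]))
```

All three variants compute the same row sums; only the data
structure and the looping style differ. If the data.frame
loop is the slowest on your machine as well, converting with
as.matrix() before the loop, or replacing the loop with a
vectorised function, is usually worth trying before moving
to C.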

HTH,

Martin



