[R] MySql Versus R

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Apr 1 13:15:09 CEST 2011


On Fri, 1 Apr 2011, Henri Mone wrote:

> Dear R Users,
>
> I use a combination of MySQL and GNU R for my data crunching. I have
> to handle huge/medium-sized data sets stored in a MySQL
> database; R executes an SQL command to fetch the data and does the
> plotting with the built-in R plotting functions.
>
> The (low-level) calculations like summing, dividing, grouping, sorting
> etc. can be done either with the SQL command on the MySQL side or on
> the R side.
> My question is: which is faster for these low-level calculations / data
> rearrangements, MySQL or R? Is there a general rule of thumb for what to
> shift to the MySQL side and what to the R side?

The data transfer costs almost always dominate here. Such low-level
computations are almost always a trivial part of the total cost, so
you should do anything that reduces the size of the data to be
transferred (e.g. summarizations) in the DBMS.
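
As a rough illustration of that rule of thumb, here is a minimal sketch. It is
not from the original exchange: it assumes a hypothetical table `measurements`
with columns `grp` and `value`, made-up data, and Python's built-in `sqlite3`
module as a stand-in for the MySQL server. It compares fetching every row and
summing on the client with letting the DBMS do the grouping, so that only one
row per group has to be transferred.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
# Hypothetical table and synthetic data, for illustration only.
conn.execute("CREATE TABLE measurements (grp TEXT, value REAL)")
rows = [("a" if i % 2 == 0 else "b", float(i)) for i in range(1000)]
conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)

# Option 1: transfer every row and summarise on the client side.
all_rows = conn.execute("SELECT grp, value FROM measurements").fetchall()
client_sums = {}
for grp, value in all_rows:
    client_sums[grp] = client_sums.get(grp, 0.0) + value

# Option 2: summarise in the DBMS; only one row per group is transferred.
db_rows = conn.execute(
    "SELECT grp, SUM(value) FROM measurements GROUP BY grp"
).fetchall()
db_sums = dict(db_rows)

print(len(all_rows), "rows transferred for client-side summing")
print(len(db_rows), "rows transferred for in-database summing")
conn.close()
```

Both options give the same sums; what differs is how many rows cross the
connection. The same pattern should carry over to R with a `GROUP BY` query
passed to `dbGetQuery()` from the DBI package, though that variant is not
shown or tested here.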

I do wonder what you think the R-sig-db list is for if not questions 
such as this one.  Please subscribe and use it next time.

> Thanks
> Henri

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
