[R] MySql Versus R
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Apr 1 13:15:09 CEST 2011
On Fri, 1 Apr 2011, Henri Mone wrote:
> Dear R Users,
> I use a combination of MySQL and GNU R for my data crunching. I have
> to handle huge/medium-sized data which is stored in a MySQL
> database; R executes an SQL command to fetch the data and does the
> plotting with the built-in R plotting functions.
> The (low-level) calculations like summing, dividing, grouping, sorting
> etc. can be done either with the SQL command on the MySQL side or on
> the R side.
> My question is: which is faster for these low-level calculations / data
> rearrangements, MySQL or R? Is there a general rule of thumb for what to
> shift to the MySQL side and what to the R side?
Data transfer costs almost always dominate here. Such low-level
computations are almost always a trivial part of the total cost, so
you should do whatever reduces the size of the data (e.g.
summarizations) in the DBMS, before it is transferred to R.
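
As a rough illustration of that advice, here is a minimal sketch comparing
the two approaches. It is not taken from the original exchange: the table
"measurements", its columns "grp" and "value", and the data are all
hypothetical, and an in-memory SQLite database (via the DBI and RSQLite
packages) stands in for the MySQL server so that the example is
self-contained. With a real MySQL server only the dbConnect() call should
need to change.

```r
library(DBI)

# Assumption: in-memory SQLite stands in for the MySQL server.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table: 100,000 measurements spread over 5 groups.
set.seed(1)
measurements <- data.frame(
  grp   = sample(letters[1:5], 1e5, replace = TRUE),
  value = runif(1e5),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "measurements", measurements)

# Option A: transfer every row to R, then summarise in R.
raw  <- dbGetQuery(con, "SELECT grp, value FROM measurements")
in_r <- aggregate(value ~ grp, data = raw, FUN = sum)

# Option B: summarise in the DBMS and transfer one row per group.
in_db <- dbGetQuery(con, "
  SELECT grp, SUM(value) AS value
  FROM measurements
  GROUP BY grp
  ORDER BY grp
")

dbDisconnect(con)

nrow(raw)    # number of rows transferred by option A
nrow(in_db)  # number of rows transferred by option B
```

Both options yield the same per-group sums, but option B moves only one
row per group across the connection instead of the whole table. Over a
network link to a real server that difference in transferred volume is
what one would expect to matter most.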
I do wonder what you think the R-sig-db list is for if not questions
such as this one. Please subscribe and use it next time.
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595