On Fri, 1 Apr 2011, Henri Mone wrote:

Dear R Users,

For my data crunching I use a combination of MySQL and GNU R. I have
to handle huge/medium-sized data which is stored in a MySQL
database. R executes an SQL command to fetch the data and does the
plotting with the built-in R plotting functions.

The (low-level) calculations like summing, dividing, grouping, sorting
etc. can be done either with the SQL command on the MySQL side or on
the R side.
My question is: which is faster for these low-level calculations / data
rearrangements, MySQL or R? Is there a general rule of thumb for what to
shift to the MySQL side and what to the R side?

Data transfer costs almost always dominate here. Since such low-level computations are almost always a trivial part of the total cost, you should do in the DBMS whatever reduces the size of the data to be transferred (e.g. summarizations).
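
As a rough illustration of that rule of thumb, the sketch below compares how many rows cross the database boundary when a grouped sum is computed on the client versus in the DBMS. It is not from the original thread: Python with an in-memory SQLite database stands in for R and MySQL only so that it runs without a server, and the table name `measurements` and its columns `grp` and `value` are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (grp TEXT, value REAL)")  # hypothetical table

# 10,000 rows spread over 5 groups.
rows = [("g%d" % (i % 5), float(i)) for i in range(10000)]
conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)

# Option 1: transfer every row, then sum per group on the client side.
all_rows = conn.execute("SELECT grp, value FROM measurements").fetchall()
client_sums = {}
for grp, value in all_rows:
    client_sums[grp] = client_sums.get(grp, 0.0) + value

# Option 2: sum per group in the DBMS and transfer only the summary.
db_rows = conn.execute(
    "SELECT grp, SUM(value) FROM measurements GROUP BY grp"
).fetchall()
db_sums = dict(db_rows)

print(len(all_rows), "rows transferred for client-side summing")
print(len(db_rows), "rows transferred for DBMS-side summing")
conn.close()
```

Both options give the same per-group sums; the difference is the amount of data that has to be moved. With a real MySQL server on another machine the same pattern would apply, with the transfer going over the network.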

I do wonder what you think the R-sig-db list is for if not questions such as this one. Please subscribe and use it next time.

Thanks
Henri

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
