The actual answer (which way is faster) can go either way in general, 
depending on how much data you extract and how well optimized the SQL 
queries and indexes are. The amount of data you mention for one or five 
weeks seems small for most modern computers, though (you did not mention 
how many columns, but it would have to be a lot to change the story).

I would go for all five weeks at once, and only break it up if I had problems.
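
For what it's worth, here is a minimal sketch of the single-query approach, 
assuming 'con' is an already-open DBI connection and using a hypothetical 
table 'events' with a timestamp column 'event_time' (substitute your own 
table and column names):

library(DBI)  # provides dbGetQuery(); 'con' assumed connected to PostgreSQL

## Weeks run Sunday..Saturday: find the most recent Saturday (possibly
## today), then the Sunday that opens the oldest of the five weeks.
today     <- Sys.Date()
last_sat  <- today - (as.integer(format(today, "%w")) + 1L) %% 7L
first_sun <- last_sat - 34L

## One round trip for all five weeks, then cut into weekly chunks in R.
sql <- sprintf("SELECT event_time FROM events
                WHERE event_time >= '%s' AND event_time < '%s'",
               first_sun, last_sat + 1L)
dat <- dbGetQuery(con, sql)

week_starts <- seq(first_sun, by = 7L, length.out = 5L)
dat$week    <- cut(as.Date(dat$event_time),
                   breaks = c(week_starts, last_sat + 1L), right = FALSE)

for (wk in split(dat, dat$week)) {
  ## one simple bar chart per week, e.g. daily row counts
  barplot(table(as.Date(wk$event_time)),
          main = paste("Week beginning", wk$week[1]))
}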
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

On July 15, 2014 7:38:49 AM PDT, John McKown <john.archie.mck...@gmail.com> wrote:
>I have some data in an SQL database (PostgreSQL, to be exact). This
>database (one table in particular) has a lot of data in it: years' worth,
>in fact (2.3 million rows since 2012). That's just for background.
>
>My question is: if I am going to create 5 separate graphs, one graph for
>each of the previous 5 weeks (in this case a week is Sunday to Saturday),
>is it better to read in all 5 weeks' worth of data in a single dbGetQuery
>whose SELECT has a WHERE clause that gets the proper 5 weeks' worth of
>data, and then subset in R? Or is it better to get a single week's worth
>of data in a dbGetQuery, with the proper SELECT ... WHERE, and then
>process each week? There seem to be anywhere from 7,500 to 9,000 entries
>for a single week. The calculations for each week are independent of any
>other week's data. Basically I am just creating a simple bar chart.
>
>In the first case, I do one SELECT and then subset the data.frame in the
>for() loop. In the second case, I still use a for() loop, but I do a
>SELECT in each iteration and don't need to subset the data.frame.
>
>I have read "Data Import/Export". The only advice I can find is based on:
>
><quote>
>The sort of statistical applications for which DBMS might be used are to
>extract a 10% sample of the data, to cross-tabulate data to produce a
>multi-dimensional contingency table, and to extract data group by group
>from a database for separate analysis.
></quote>
>
>The "extract data group by group ..." phrase seems, to me, to say that I
>should extract and process each week's data separately, using a
>dbGetQuery in my loop. Am I interpreting this correctly?
>
>Thanks for your advice.
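
For comparison, the week-at-a-time variant you describe would look roughly 
like this (same assumptions about 'con' and the hypothetical 'events' table 
as in the sketch above); it trades one larger result set for five small 
round trips, and skips the subsetting:

week_starts <- seq(first_sun, by = 7L, length.out = 5L)
## loop over indices, not the Date vector itself: for() over Dates in R
## strips the Date class
for (i in seq_along(week_starts)) {
  ws  <- week_starts[i]
  sql <- sprintf("SELECT event_time FROM events
                  WHERE event_time >= '%s' AND event_time < '%s'",
                 ws, ws + 7L)
  wk  <- dbGetQuery(con, sql)  # ~7,500-9,000 rows per week, per your numbers
  barplot(table(as.Date(wk$event_time)),
          main = paste("Week beginning", ws))
}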
