Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Gabor Grothendieck
Regarding the explanation of where the time goes, it might be parsing the statement or developing the query plan. The SQL statement for the more complex query is obviously much longer, and its generated query plan involves 95 lines of byte code vs 19 lines of generated code for the simpler q
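
A minimal sketch of how such byte-code line counts could be obtained, assuming the queries run through sqldf on its default SQLite backend. SQLite's `explain` prefix returns one row per byte-code instruction. The two queries and the helper name `count_ops` are illustrative stand-ins, not the statements timed in this thread.

```r
# Count the byte-code instructions SQLite generates for a statement.
# Assumption: sqldf is installed and uses its default SQLite backend.
# Returns NA when sqldf is not available, so the script still runs.
count_ops <- function(sql) {
  if (!requireNamespace("sqldf", quietly = TRUE)) return(NA_integer_)
  nrow(sqldf::sqldf(paste("explain", sql)))
}

# Illustrative queries against the built-in BOD data frame.
simple_sql  <- "select * from BOD"
complex_sql <- "select a.Time, sum(b.demand) as cum_demand
                from BOD a join BOD b on b.Time <= a.Time
                group by a.Time order by a.Time desc"

ops <- c(simple = count_ops(simple_sql), complex = count_ops(complex_sql))
print(ops)
```

A longer plan means more instructions to generate, but the counts alone do not show how much of the run time goes to planning rather than execution.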

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Matthew Dowle
I'm talking about ease of use too. The first line of the Details section in ?"[.data.table" says: "Builds on base R functionality to reduce 2 types of time : 1. programming time (easier to write, read, debug and maintain) 2. compute time" Once again, I am merely saying that the

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Gabor Grothendieck
I think one would only be concerned about such internals if one were primarily interested in performance; otherwise, one would be more interested in ease of specification and part of that ease is having it independent of implementation and separating implementation from specification activities. A

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Matthew Dowle
Are you claiming that SQL is that utopia? SQL is a row store. It cannot give the user the benefits of a column store. For example, why does SQL take 113 seconds in the example in this thread: http://tolstoy.newcastle.edu.au/R/e9/help/10/01/1872.html but data.table takes 5 seconds to get the same

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Gabor Grothendieck
It's only important internally. Externally it's undesirable that the user have to get involved in it. The idea of making software easy to write and use is to hide the implementation and focus on the problem. That is why we use high level languages, object orientation, etc. On Thu, Jan 28, 2010 at

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Matthew Dowle
How it represents data internally is very important, depending on the real goal : http://en.wikipedia.org/wiki/Column-oriented_DBMS "Gabor Grothendieck" wrote in message news:971536df1001271710o4ea62333l7f1230b860114...@mail.gmail.com... How it represents data internally should not be importa

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-27 Thread Gabor Grothendieck
How it represents data internally should not be important as long as you can do what you want. SQL is declarative so you just specify what you want rather than how to get it and invisibly to the user it automatically draws up a query plan and then uses that plan to get the result. On Wed, Jan 27,

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-27 Thread Matthew Dowle
> sqldf("select * from BOD order by Time desc limit 3") Exactly. SQL requires use of order by. It knows the order, but it isn't ordered. That's not good, but might be fine, depending on what the real goal is. "Gabor Grothendieck" wrote in message news:971536df1001270629w4795da89vb7d77af6e4e8b
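
A small sketch of the ordering point, using the built-in BOD data frame from the query above. An R data frame keeps its row order, so positional access needs no sort, while the SQL query has to state the order explicitly. The sqldf cross-check runs only if sqldf happens to be installed.

```r
# BOD ships with base R (datasets package): 6 rows, columns Time and demand.
# A data frame keeps its row order, so "last three rows" needs no sort.
last3 <- tail(BOD, 3)

# An SQL table has no inherent row order, so the query must sort explicitly.
# The base R counterpart of
#   select * from BOD order by Time desc limit 3
# is:
top3 <- head(BOD[order(BOD$Time, decreasing = TRUE), ], 3)

# Optional cross-check against sqldf, only if it is installed.
if (requireNamespace("sqldf", quietly = TRUE)) {
  via_sql <- sqldf::sqldf("select * from BOD order by Time desc limit 3")
  stopifnot(all(via_sql$Time == top3$Time))
}
print(top3)
```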

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-27 Thread Gabor Grothendieck
On Wed, Jan 27, 2010 at 8:56 AM, Matthew Dowle wrote: > How many columns, and of what type are the columns? As Olga asked too, it > would be useful to know more about what you're really trying to do. > > 3.5m rows is not actually that many rows, even for 32-bit R. It depends on > the columns and

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-27 Thread Matthew Dowle
How many columns, and of what type are the columns? As Olga asked too, it would be useful to know more about what you're really trying to do. 3.5m rows is not actually that many rows, even for 32-bit R. It depends on the columns and what you want to do with those columns. At the risk of sugge

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-27 Thread Olga Lyashevska
Hi Nathan, I have a table (contact) with several fields and its PK is an auto increment field. I'm bulk loading data to this table from files which if successful will be about 3.5 million rows (approx 16000 rows per file). However, I have a linking table (an_contact) to resolve an m:m rela

[R] RMySQL - Bulk loading data and creating FK links

2010-01-27 Thread Nathan S. Watson-Haigh
I have a table (contact) with several fields and its PK is an auto increment field. I'm bulk loading data to this table from files which if successful will be about 3.5 million rows (approx 16000 rows per file). However, I have a linking table (an_contact) to resolve an m:m relationship between
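
One possible way to build the link rows after a bulk load, sketched in base R under loud assumptions: the post does not give the column names, so `source_key`, `an_id` and `contact_id` are hypothetical, and the `loaded` data frame stands in for what a query such as `dbGetQuery(con, "select id, source_key from contact")` might return once the rows are in MySQL. This is not necessarily the approach the thread settled on.

```r
# Hypothetical stand-ins: the real contact / an_contact columns are not given.
new_contacts <- data.frame(
  source_key = c("f1-r1", "f1-r2", "f1-r3"),  # hypothetical natural key per loaded row
  an_id      = c(10L, 10L, 20L),              # hypothetical id on the other side of the m:m link
  stringsAsFactors = FALSE
)

# Stand-in for the rows read back from the database after the bulk load,
# carrying the auto-increment ids that MySQL assigned.
loaded <- data.frame(
  id         = c(101L, 102L, 103L),
  source_key = c("f1-r2", "f1-r1", "f1-r3"),
  stringsAsFactors = FALSE
)

# Map each loaded row to its generated PK, then keep only the two FK columns.
links <- merge(new_contacts, loaded, by = "source_key")
an_contact <- data.frame(an_id = links$an_id, contact_id = links$id)
print(an_contact)
```

The resulting `an_contact` data frame could then be appended to the linking table in one step, which avoids looking up the generated id row by row.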