> The subset table isn't a copy of the subset, it contains the unique key
> and an indicator column showing whether the element is in the subset. I
> need this even if the subset is never modified, so that I can join it to
> the main table and use it in SQL 'where' conditions to get computations
> for the right subset of the data.
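For concreteness, a minimal sketch of that layout (not sqlsurvey's actual
code): the names main_tbl, subset_tbl, idkey, and in_subset are
hypothetical, and SQLite stands in for MonetDB only to keep the example
self-contained.

    library(DBI)

    con <- dbConnect(RSQLite::SQLite(), ":memory:")

    ## the main data table, keyed by a unique id
    dbWriteTable(con, "main_tbl",
                 data.frame(idkey = 1:5, x = c(2, 4, 6, 8, 10)))

    ## the "subset" table: just the key plus a 0/1 indicator,
    ## not a copy of the subset's rows
    dbWriteTable(con, "subset_tbl",
                 data.frame(idkey = 1:5, in_subset = c(1L, 0L, 1L, 1L, 0L)))

    ## aggregate over the subset inside the database: join on the key
    ## and restrict with a WHERE condition on the indicator
    dbGetQuery(con, "
      SELECT AVG(m.x) AS mean_x
      FROM main_tbl m
      JOIN subset_tbl s ON m.idkey = s.idkey
      WHERE s.in_subset = 1
    ")

    dbDisconnect(con)

Only the key and a one-bit flag are stored per element, and the same
join/WHERE pattern restricts any aggregation to the subset without ever
copying rows, whether or not the subset is modified.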
Cool - is that faster than storing a column that just contains the include
indices?

> The whole point of this new sqlsurvey package is that most of the
> aggregation operations happen in the database rather than in R, which is
> faster for very large data tables. The use case is things like the
> American Community Survey and the Nationwide Emergency Department
> Subsample, with millions or tens of millions of records and quite a lot
> of variables. At this scale, loading stuff into memory isn't feasible on
> commodity desktops and laptops, and even on computers with enough memory,
> the database (MonetDB) is faster.

Have you done any comparisons of MonetDB vs SQLite? I'm interested to know
how much faster it is. I'm working on a package
(https://github.com/hadley/dplyr) that compiles R data manipulation
expressions into other backends (e.g. SQL), and I have been wondering
whether it's worth considering a column-store like MonetDB.

Hadley

--
Chief Scientist, RStudio
http://had.co.nz/
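For illustration, a sketch of the kind of translation being described,
written against the dplyr/dbplyr API as it was eventually released (an
assumption relative to this thread); the table and column names are made
up, and SQLite again keeps the example self-contained.

    library(dplyr)

    con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
    DBI::dbWriteTable(con, "main_tbl",
                      data.frame(g = c("a", "a", "b"), x = c(1, 2, 3)))

    ## a lazy reference to the database table; nothing is pulled into R yet
    main <- tbl(con, "main_tbl")

    q <- main %>%
      group_by(g) %>%
      summarise(mean_x = mean(x, na.rm = TRUE))

    show_query(q)  # the SQL that the R expression compiles to
    collect(q)     # execute in the database and bring the result into R

    DBI::dbDisconnect(con)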