ui's suggestion and use a relational database system to handle the huge
data.
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [R] Comparison of aggregate in R and group by in mysql
> CC: [EMAIL PROTECTED]
>
> I think with your data you will be computing a matrix that is 7049 x
> 11704. This will require about 700MB of memory.
> ...an))
> (I killed it after 30 minutes)
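
As a sanity check on that figure (assuming R's usual 8 bytes per
numeric element), the matrix alone accounts for roughly the quoted
amount:

7049 * 11704 * 8 / 2^20   # about 630 MB, before any intermediate copies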

Date: Sat, 26 Jan 2008 19:55:51 -0500
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [R] Comparison of aggregate in R and group by in mysql
CC: [EMAIL PROTECTED]

How large is your dataframe? How much memory do you have on your
system? Are you paging? Here is a test I ran with a data frame with
1,000,000 entries and it seems to be fast:
> n <- 1000000
> x <- data.frame(A=sample(LETTERS,n,TRUE), B=sample(letters[1:4],n,TRUE),
+ C=sample(LETTERS[1:4],n,TRUE), ...
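
The tail of that test is cut off in the archive. A self-contained
reconstruction along the same lines (the D column and the exact
aggregate() call are illustrative guesses, not the original code):

n <- 1000000
x <- data.frame(A = sample(LETTERS, n, TRUE),
                B = sample(letters[1:4], n, TRUE),
                C = sample(LETTERS[1:4], n, TRUE),
                D = runif(n))   # hypothetical numeric column to summarize
# grouped mean, the R analogue of SQL's GROUP BY ... AVG()
system.time(x.agg <- aggregate(x$D, list(A = x$A, B = x$B, C = x$C), mean))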
How does it compare if you read it into R and then do your
aggregate with sqldf:
library(sqldf)
# example using builtin data set CO2
CO2agg <- sqldf("select Plant, Type, Treatment, avg(conc) from CO2
group by Plant, Type, Treatment")
# or using your data:
Xagg <- sqldf("select Group, Age, T
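
For comparison with the aggregate() test above, the same grouped mean
over the simulated data frame x (using the reconstruction sketched
earlier) would be something like:

Xagg2 <- sqldf("select A, B, C, avg(D) from x group by A, B, C")

sqldf loads the data frame into a temporary SQLite database, runs the
query there, and returns an ordinary data frame.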
huali,
if I were you, I would create a view on the MySQL server to aggregate
the data first and then use R to pull the data through that view. This
is not only applicable to R but is a general guideline in similar
situations.
Per my understanding and experience, R is able to do data manipul
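
A minimal sketch of the view-then-pull approach, assuming hypothetical
table, column, and connection details and the RMySQL driver (any DBI
backend works the same way):

## once, on the MySQL side:
##   CREATE VIEW agg_view AS
##     SELECT grp, AVG(val) AS avg_val FROM big_table GROUP BY grp;
library(DBI)
library(RMySQL)
con <- dbConnect(MySQL(), dbname = "mydb",
                 username = "user", password = "pass")
agg <- dbGetQuery(con, "SELECT * FROM agg_view")  # only aggregated rows cross the wire
dbDisconnect(con)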