Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
I see now, thanks for explaining that. Would it be for you to add data.table methods to ddply then, for this to happen? Or does a ddply method need to be added to data.table? "hadley wickham" wrote in message news:f8e6ff051001200825q4009a122m122082a9df5fe...@mail.gmail.com... > On Wed, Jan 20

Re: [R] problem of data manipulation

2010-01-20 Thread hadley wickham
On Wed, Jan 20, 2010 at 8:43 AM, Matthew Dowle wrote: > Sounds like a good idea. Would it be possible to give an example of how to > combine plyr with data.table, and why that is better than a data.table only > solution ? Well, ideally, you'd do: adt <- data.table(a) ans2 <- ddply(a, c("var1", "

Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
Sounds like a good idea. Would it be possible to give an example of how to combine plyr with data.table, and why that is better than a data.table only solution ? "hadley wickham" wrote in message news:f8e6ff051001200624r2175e38xf558dc8fa3fb6...@mail.gmail.com... > Note that in the documentation

Re: [R] problem of data manipulation

2010-01-20 Thread hadley wickham
> Note that in the documentation ?"[.data.table" where I say that 'by' is slow, > I mean relative to how fast it could be. It seems, in this specific > example anyway, and with the code posted so far, to be significantly faster > than sqldf and plyr. Of course the best of both worlds would be to

Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
The user wrote in their first post : > I have a lot of observations in my dataset Here's one way to do it with a data.table : a=data.table(a) ans = a[ , list(dt=dt[dt-min(dt)<7]) , by="var1,var2,var3"] class(ans$dt) = "Date" Timings are below comparing the 3 methods. In

Re: [R] problem of data manipulation

2010-01-19 Thread Gabor Grothendieck
Using data frame, a, from the post below this is how it would be done in SQL using sqldf. We join together the original table, a, with a table of minimums (computed by the nested select) and then choose only the rows where dt - mindt < 7 (in the where clause). > library(sqldf) > sqldf("select va
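The join-then-filter logic described in this message can also be sketched in base R, without sqldf. This is only an illustration, not the code from the post: it rebuilds the data frame `a` from the original question, and the column names `day` and `minday` are introduced here for the example.

```r
# Data frame from the original post (dates are dd/mm/yyyy strings).
a <- data.frame(
  var1 = c("s", "c", "c", "n", "n", "n"),
  var2 = c(rep(1, 3), rep(2, 3)),
  var3 = c(rep(2, 3), rep(1, 3)),
  var4 = c("01/01/1999", "10/02/2000", "13/02/2000",
           "11/02/2000", "15/02/2000", "23/02/2000")
)

# Dates as day counts, so the arithmetic survives aggregate() and merge().
# 'day' is a helper column assumed for this sketch.
a$day <- as.numeric(as.Date(a$var4, "%d/%m/%Y"))

# Table of per-group minimums (the role played by the nested select).
mins <- aggregate(day ~ var1 + var2 + var3, data = a, FUN = min)
names(mins)[names(mins) == "day"] <- "minday"

# Join the minimums back onto the original rows, then keep only the rows
# that fall within 7 days of their group's earliest date.
joined <- merge(a, mins, by = c("var1", "var2", "var3"))
ans <- joined[joined$day - joined$minday < 7,
              c("var1", "var2", "var3", "var4")]
print(ans)
```

Working on numeric day counts rather than Date objects is a deliberate simplification here, so the sketch does not depend on how aggregate() treats the Date class.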

Re: [R] problem of data manipulation

2010-01-19 Thread hadley wickham
On Mon, Jan 18, 2010 at 1:54 PM, Bert Gunter wrote: > One way to do it: > > 1. Convert your date column to the Date class using the as.Date() function. > This allows you to do the necessary arithmetic on the dates below. > dt <- as.Date(a[,4],"%d/%m/%Y") > > 2. Create a factor out of your first th
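A minimal base R sketch in the spirit of the quoted steps, using the data frame from the original question. Step 1 follows the post. The rest is an assumption about where the truncated step 2 is heading: a single grouping factor is built from var1, var2 and var3, and rows are kept when their date is less than 7 days after the group's earliest date. The names `grp`, `first_day` and `keep` are made up for this example.

```r
# Data frame from the original post (dates are dd/mm/yyyy strings).
a <- data.frame(
  var1 = c("s", "c", "c", "n", "n", "n"),
  var2 = c(rep(1, 3), rep(2, 3)),
  var3 = c(rep(2, 3), rep(1, 3)),
  var4 = c("01/01/1999", "10/02/2000", "13/02/2000",
           "11/02/2000", "15/02/2000", "23/02/2000")
)

# Step 1 from the post: convert the date column to class Date.
dt <- as.Date(a[, 4], "%d/%m/%Y")

# Assumed step 2: one grouping factor from the first three columns.
# drop = TRUE discards combinations that never occur in the data.
grp <- interaction(a$var1, a$var2, a$var3, drop = TRUE)

# Earliest date per group, repeated for every row of that group.
# ave() is given numeric day counts rather than Date objects.
first_day <- ave(as.numeric(dt), grp, FUN = min)

# Keep rows less than 7 days after their group's earliest date.
keep <- as.numeric(dt) - first_day < 7
ans <- a[keep, ]
print(ans)
```

Because interaction() accepts numeric vectors as well as factors, the same sketch should carry over when var2 and var3 are numeric values rather than factors.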

Re: [R] problem of data manipulation

2010-01-18 Thread Bert Gunter
p Cc: Bert Gunter; r-help@r-project.org Subject: Re: [R] problem of data manipulation I just remembered that my actual dataset for var2 and var3 are numerical data,e.g. 12.34, not factors. The above example data is misleading.   Suppose var2 and var3 are numerical variables, not factors. How shoul

Re: [R] problem of data manipulation

2010-01-18 Thread rusers.sh
> > -Original Message- > > From: Bert Gunter [mailto:gunter.ber...@gene.com] > > Sent: Monday, January 18, 2010 12:32 PM > > To: William Dunlap; 'rusers.sh'; r-help@r-project.org > > Subject: RE: [R] problem of data manipulation > > > > Absolutely... so

Re: [R] problem of data manipulation

2010-01-18 Thread rusers.sh
Thank you so much. I got it. 2010/1/18 William Dunlap > > -Original Message- > > From: Bert Gunter [mailto:gunter.ber...@gene.com] > > Sent: Monday, January 18, 2010 12:32 PM > > To: William Dunlap; 'rusers.sh'; r-help@r-project.org > > Subje

Re: [R] problem of data manipulation

2010-01-18 Thread William Dunlap
> -Original Message- > From: Bert Gunter [mailto:gunter.ber...@gene.com] > Sent: Monday, January 18, 2010 12:32 PM > To: William Dunlap; 'rusers.sh'; r-help@r-project.org > Subject: RE: [R] problem of data manipulation > > Absolutely... so long as yo

Re: [R] problem of data manipulation

2010-01-18 Thread Bert Gunter
:15 PM To: Bert Gunter; rusers.sh; r-help@r-project.org Subject: Re: [R] problem of data manipulation > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter > Sent: Monday, January 18, 2010 11:54 AM > To:

Re: [R] problem of data manipulation

2010-01-18 Thread William Dunlap
> -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter > Sent: Monday, January 18, 2010 11:54 AM > To: 'rusers.sh'; r-help@r-project.org > Subject: Re: [R] problem of data manipulation &

Re: [R] problem of data manipulation

2010-01-18 Thread Bert Gunter
ject.org] On Behalf Of rusers.sh Sent: Monday, January 18, 2010 10:40 AM To: r-help@r-project.org Subject: [R] problem of data manipulation Hello, See my problem below. a<-data.frame(c("s","c","c","n","n","n"),c(rep(1,3),rep(2,3)),c(rep(

[R] problem of data manipulation

2010-01-18 Thread rusers.sh
Hello, See my problem below. a<-data.frame(c("s","c","c","n","n","n"),c(rep(1,3),rep(2,3)),c(rep(2,3),rep(1,3)),c("01/01/1999","10/02/2000","13/02/2000","11/02/2000","15/02/2000","23/02/2000")) colnames(a)<-c("var1","var2","var3","var4") > a var1 var2 var3 var4 1s1201/01/1