Hi Weijia, This will give you the rownames of the split variables. lst1 <- split(a,list(a$COUNTRY,a$SITEID))
res <- t(sapply(lst1,function(x) { x$SCRNDT <- as.Date(x$SCRNDT, "%d-%b-%y") unlist(lapply(split(b,b$TMPT),function(y){ sum(x$SCRNDT >= y$DT_ETP & x$SCRNDT <= y$DT_END) })) })) colnames(res) <- paste0("WEEK",colnames(res)) res[5:8,1:4] # WEEK1 WEEK2 WEEK3 WEEK4 #USA.5 0 0 0 4 #USA.6 0 0 0 1 #USA.8 0 0 0 0 #USA.9 0 0 0 0 A.K. On , arun <smartpink...@yahoo.com> wrote: Yes, Sorry a typo. I was in a hurry when I sent it. On Wednesday, October 16, 2013 2:32 PM, Weijia Wang <zelee...@gmail.com> wrote: Arun I think you mean sum(x$SCRNDT>= y$DT_ETP & x$SCRNDT <= y$DT_END) instead of DT_ETP at the end of this line, right? W 2013/10/16 arun <smartpink...@yahoo.com> HI Weijia, > >Please check whether this is what you wanted. > >Weijia <- load("/home/arunksa111/Downloads/arun_help.RData" ) >a[sapply(a,is.factor)] <-lapply(a[sapply(a,is.factor)],as.character) >str(a) > b[sapply(b,is.factor)] <- lapply(b[sapply(b,is.factor)],as.character) >str(b) > b$DT_ETP <- as.Date(b$DT_ETP,"%d-%b-%y") > b$DT_END <- c(b$DT_ETP[-1]-1, b$DT_ETP[length(b$DT_ETP)]+6) >res <- >do.call(rbind,lapply(split(a,list(a$COUNTRY,a$SITEID),drop=TRUE),function(x){ >x$SCRNDT <- as.Date(x$SCRNDT, "%d-%b-%y") >do.call(cbind,lapply(split(b,b$TMPT),function(y) { > sum(x$SCRNDT>= y$DT_ETP & x$SCRNDT <= y$DT_ETP) >})) > })) > > colnames(res) <- paste0("WEEK",colnames(res)) > > > >A.K. > > > > > >On Wednesday, October 16, 2013 10:48 AM, Weijia Wang <zelee...@gmail.com> >wrote: > >Hi, Arun > >Here I attached a R object with two dataframes. > >The first one is the one I need to count the number of dates that fall into a >certain week interval. > >For example, > > STUDYID COUNTRY SITEID SUBJID SCRNDT >1 GRTMD101 USA 13 130101 4-Dec-12 >2 GRTMD101 USA 13 130102 4-Dec-12 >3 GRTMD101 USA 13 130103 4-Dec-12 >4 GRTMD101 USA 6 60101 5-Dec-12 >5 GRTMD101 USA 5 50101 5-Dec-12 >6 GRTMD101 USA 13 130104 6-Dec-12 > >So I will need to count number so dates under 'SCRNDT' for each unique >'SITEID' that falls into the >following 'DT_ETP' > > STUDYID DT_ETP TMPT >1 GRTMD101 9-Nov-12 1 >2 GRTMD101 16-Nov-12 2 >3 GRTMD101 23-Nov-12 3 >4 GRTMD101 30-Nov-12 4 >5 GRTMD101 7-Dec-12 5 >6 GRTMD101 14-Dec-12 6 > >For example, for site 13, the 1st and 2nd SCRNDT both are 4-Dec-12 that falls >into the week from '30-Nov-12' to '7-Dec-12', then in the result dataset, it >should say, 2 under the variable 'week 4', and the result data frame should >look like this: > >SITEID WEEK1 WEEK2 WEEK3 WEEK4.......WEEK64 (The values are just an example) > 1 0 0 1 2 3 > 2 1 2 3 5 6 > . . . . . . > >Here is a code I modified based on the one you sent me, but quite intuitively, >and R returns an empty vector to me, lol, > > > library(plyr) > res <- ddply(a,.(COUNTRY, SITEID), function(x) { > x$SCRNDT <-as.Date(x$SCRNDT,"%d-%b-%y") > unsplit(lapply(split(b,b$TMPT),function(y) { > numweek<-as.numeric(length(unique(b$TMPT))) > for (i in 1:numweek) { > y$DT_ETP <- as.Date(y$DT_ETP, >"%d-%b-%y") > sum(x$SCRNDT > y$DT_ETP[i] & >x$SCRNDT <= y$DT_ETP[i+1])}}), > b$TMPT)}) > > > > >I really hope you can teach me on this one again! Thank you so much!! > >W > > > > > >2013/10/15 arun <smartpink...@yahoo.com> > >HI Weijia, >>No problem. >>Let me know if it works. One slight modification as I noticed that your >>weeks are not overlapping >>#change >> >> >> >>sum(x$Date > y$from & x$Date <= y$to) >> >>#to >>sum(x$Date >= y$from & x$Date <= y$to >> >>Regards, >>Arun >> >> >> >> >> >> >>On Tuesday, October 15, 2013 8:14 PM, Weijia wang <zelee...@gmail.com> wrote: >>Thank you Arun! I will try it and get back to you! You are amazing! >> >>Weijia Wang >> >> >>> On Oct 15, 2013, at 8:00 PM, arun <smartpink...@yahoo.com> wrote: >>> >>> Hi, >>> >>> Please use ?dput() to show the dataset. Also, it is not clear about how >>> you store the time interval. >>> dat <- read.table(text=" >>> GroupID Date >>> 1 1 10-Dec-12 >>> 2 1 11-Dec-12 >>> 3 2 13-Dec-12 >>> 4 2 15-Dec-12 >>> 5 3 06-Dec-12 >>> 6 3 19-Dec-12",sep="",header=TRUE,stringsAsFactors=FALSE) >>> >>> >>> >>> dat2 <- data.frame(Week=1:2, from= c("9-Dec-12", "16-Dec-12"), >>>to=c("15-Dec-12","22-Dec-12"),stringsAsFactors=FALSE) >>> >>> #Check ?findInterval() >>> >>> res <- t(sapply(split(dat,dat$GroupID), function(x) { >>> x$Date <-as.Date(x$Date,"%d-%b-%y") >>> >>>unsplit(lapply(split(dat2,dat2$Week),function(y) { >>> y$from <- as.Date(y$from, >>>"%d-%b-%y") >>> y$to <- as.Date(y$to, "%d-%b-%y") >>> sum(x$Date > y$from & x$Date <= >>>y$to)}), >>> dat2$Week) >>> })) >>> >>> colnames(res) <- paste0("Week",1:2) >>> res >>> # Week1 Week2 >>> #1 2 0 >>> #2 2 0 >>> #3 0 1 >>> >>> >>> A.K. >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> On Tuesday, October 15, 2013 6:24 PM, Weijia Wang <zelee...@gmail.com> >>> wrote: >>> Hi, I have something very interesting: >>> >>> Say I have this: >>> >>> GroupID Date >>> 1 1 10-Dec-12 >>> 2 1 11-Dec-12 >>> 3 2 13-Dec-12 >>> 4 2 15-Dec-12 >>> 5 3 06-Dec-12 >>> 6 3 19-Dec-12 >>> >>> Now, I have time interval, >>> >>> week 1: from 9-Dec-12 to 15-Dec-12, >>> >>> week 2: from 16-Dec-12 to 22-Dec-12, and so on. >>> >>> Obviously, the 1st, 2nd, 3rd, 4th row falls to week 1, 5th rows should not >>> be counted, 6th row falls into week2. >>> >>> Therefore, by GroupID, I will have >>> >>> GroupID=1, Week1=2, Week2=0 >>> GroupID=2, Week1=2, Week2=0 >>> GroupID=3, Week1=0, Week2=1. >>> >>> I just want to count the valid date that falls into a 7-day week interval, >>> and I shall have new variables for EACH WEEK, and the counts for dates that >>> fall into this week interval. >>> >>> Can anyone please help me on programming this? >>> >>> W >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> > > >-- > >--Sent via Gmail > >Weijia Wang >Division of Epidemiology > >Department of Population Health, School of Medicine >New York University, NY, 10016 > -- --Sent via Gmail Weijia Wang Division of Epidemiology Department of Population Health, School of Medicine New York University, NY, 10016 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.