HI Weijia, Please check whether this is what you wanted.
Weijia <- load("/home/arunksa111/Downloads/arun_help.RData" ) a[sapply(a,is.factor)] <-lapply(a[sapply(a,is.factor)],as.character) str(a) b[sapply(b,is.factor)] <- lapply(b[sapply(b,is.factor)],as.character) str(b) b$DT_ETP <- as.Date(b$DT_ETP,"%d-%b-%y") b$DT_END <- c(b$DT_ETP[-1]-1, b$DT_ETP[length(b$DT_ETP)]+6) res <- do.call(rbind,lapply(split(a,list(a$COUNTRY,a$SITEID),drop=TRUE),function(x){ x$SCRNDT <- as.Date(x$SCRNDT, "%d-%b-%y") do.call(cbind,lapply(split(b,b$TMPT),function(y) { sum(x$SCRNDT>= y$DT_ETP & x$SCRNDT <= y$DT_ETP) })) })) colnames(res) <- paste0("WEEK",colnames(res)) A.K. On Wednesday, October 16, 2013 10:48 AM, Weijia Wang <zelee...@gmail.com> wrote: Hi, Arun Here I attached a R object with two dataframes. The first one is the one I need to count the number of dates that fall into a certain week interval. For example, STUDYID COUNTRY SITEID SUBJID SCRNDT 1 GRTMD101 USA 13 130101 4-Dec-12 2 GRTMD101 USA 13 130102 4-Dec-12 3 GRTMD101 USA 13 130103 4-Dec-12 4 GRTMD101 USA 6 60101 5-Dec-12 5 GRTMD101 USA 5 50101 5-Dec-12 6 GRTMD101 USA 13 130104 6-Dec-12 So I will need to count number so dates under 'SCRNDT' for each unique 'SITEID' that falls into the following 'DT_ETP' STUDYID DT_ETP TMPT 1 GRTMD101 9-Nov-12 1 2 GRTMD101 16-Nov-12 2 3 GRTMD101 23-Nov-12 3 4 GRTMD101 30-Nov-12 4 5 GRTMD101 7-Dec-12 5 6 GRTMD101 14-Dec-12 6 For example, for site 13, the 1st and 2nd SCRNDT both are 4-Dec-12 that falls into the week from '30-Nov-12' to '7-Dec-12', then in the result dataset, it should say, 2 under the variable 'week 4', and the result data frame should look like this: SITEID WEEK1 WEEK2 WEEK3 WEEK4.......WEEK64 (The values are just an example) 1 0 0 1 2 3 2 1 2 3 5 6 . . . . . . Here is a code I modified based on the one you sent me, but quite intuitively, and R returns an empty vector to me, lol, library(plyr) res <- ddply(a,.(COUNTRY, SITEID), function(x) { x$SCRNDT <-as.Date(x$SCRNDT,"%d-%b-%y") unsplit(lapply(split(b,b$TMPT),function(y) { numweek<-as.numeric(length(unique(b$TMPT))) for (i in 1:numweek) { y$DT_ETP <- as.Date(y$DT_ETP, "%d-%b-%y") sum(x$SCRNDT > y$DT_ETP[i] & x$SCRNDT <= y$DT_ETP[i+1])}}), b$TMPT)}) I really hope you can teach me on this one again! Thank you so much!! W 2013/10/15 arun <smartpink...@yahoo.com> HI Weijia, >No problem. >Let me know if it works. One slight modification as I noticed that your weeks >are not overlapping >#change > > > >sum(x$Date > y$from & x$Date <= y$to) > >#to >sum(x$Date >= y$from & x$Date <= y$to > >Regards, >Arun > > > > > > >On Tuesday, October 15, 2013 8:14 PM, Weijia wang <zelee...@gmail.com> wrote: >Thank you Arun! I will try it and get back to you! You are amazing! > >Weijia Wang > > >> On Oct 15, 2013, at 8:00 PM, arun <smartpink...@yahoo.com> wrote: >> >> Hi, >> >> Please use ?dput() to show the dataset. Also, it is not clear about how you >> store the time interval. >> dat <- read.table(text=" >> GroupID Date >> 1 1 10-Dec-12 >> 2 1 11-Dec-12 >> 3 2 13-Dec-12 >> 4 2 15-Dec-12 >> 5 3 06-Dec-12 >> 6 3 19-Dec-12",sep="",header=TRUE,stringsAsFactors=FALSE) >> >> >> >> dat2 <- data.frame(Week=1:2, from= c("9-Dec-12", "16-Dec-12"), >>to=c("15-Dec-12","22-Dec-12"),stringsAsFactors=FALSE) >> >> #Check ?findInterval() >> >> res <- t(sapply(split(dat,dat$GroupID), function(x) { >> x$Date <-as.Date(x$Date,"%d-%b-%y") >> >>unsplit(lapply(split(dat2,dat2$Week),function(y) { >> y$from <- as.Date(y$from, "%d-%b-%y") >> y$to <- as.Date(y$to, "%d-%b-%y") >> sum(x$Date > y$from & x$Date <= >>y$to)}), >> dat2$Week) >> })) >> >> colnames(res) <- paste0("Week",1:2) >> res >> # Week1 Week2 >> #1 2 0 >> #2 2 0 >> #3 0 1 >> >> >> A.K. >> >> >> >> >> >> >> >> >> >> >> On Tuesday, October 15, 2013 6:24 PM, Weijia Wang <zelee...@gmail.com> wrote: >> Hi, I have something very interesting: >> >> Say I have this: >> >> GroupID Date >> 1 1 10-Dec-12 >> 2 1 11-Dec-12 >> 3 2 13-Dec-12 >> 4 2 15-Dec-12 >> 5 3 06-Dec-12 >> 6 3 19-Dec-12 >> >> Now, I have time interval, >> >> week 1: from 9-Dec-12 to 15-Dec-12, >> >> week 2: from 16-Dec-12 to 22-Dec-12, and so on. >> >> Obviously, the 1st, 2nd, 3rd, 4th row falls to week 1, 5th rows should not >> be counted, 6th row falls into week2. >> >> Therefore, by GroupID, I will have >> >> GroupID=1, Week1=2, Week2=0 >> GroupID=2, Week1=2, Week2=0 >> GroupID=3, Week1=0, Week2=1. >> >> I just want to count the valid date that falls into a 7-day week interval, >> and I shall have new variables for EACH WEEK, and the counts for dates that >> fall into this week interval. >> >> Can anyone please help me on programming this? >> >> W >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > -- --Sent via Gmail Weijia Wang Division of Epidemiology Department of Population Health, School of Medicine New York University, NY, 10016 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.