Hello, I have a question about working with dates in R. I would like to summarize a response variable over designated, irregular time periods. The purpose is to compare the summarized values (which were sampled daily) to another variable that was sampled less frequently. Below is a trivial example where I would like to summarize the response variable dat$x so that I have average and sum values for Sept 25-27 and Sept 28-Oct 1. Can anyone suggest an efficient way to deal with dates like this? In a previous, extremely tedious effort, I simply created another grouping variable, but I had to do this manually. For a large dataset this really isn't a good option.
Thanks in advance!

Sam

library(plyr)

dat <- data.frame(
  x = runif(7, 0, 125),  # one value per daily date
  date = as.Date(c("2009-09-25", "2009-09-26", "2009-09-27", "2009-09-28",
                   "2009-09-29", "2009-09-30", "2009-10-01"),
                 format = "%Y-%m-%d"),
  yy = rep(letters[1:2], length.out = 7),  # placeholder factor, one level per row
  stringsAsFactors = TRUE
)

# If I were using a regular factor, I would do something like this, and this is
# the kind of result I am hoping for (obviously switching yy for date as the
# grouping variable).
ddply(dat, c("yy"), function(df) return(c(avg = mean(df$x), sum = sum(df$x))))

# This is the data.frame that I would like to compare to dat.
dat2 <- data.frame(
  y = runif(2, 0, 125),
  date = as.Date(c("2009-09-27", "2009-10-01"), format = "%Y-%m-%d")
)

-- 
Sam Albers
Geography Program
University of Northern British Columbia
3333 University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
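
For illustration, here is a minimal base-R sketch of one way the grouping variable could be built automatically instead of by hand. It assumes that each date in dat2 marks the end of a sampling period, so every daily row is assigned to the first dat2 date on or after it. That reading is an assumption based on the Sept 25-27 and Sept 28-Oct 1 example, and the names ends, idx, period_end, res and merged are made up for this sketch.

```r
set.seed(1)  # only so the example is reproducible

dat <- data.frame(
  x = runif(7, 0, 125),
  date = seq(as.Date("2009-09-25"), as.Date("2009-10-01"), by = "day")
)
dat2 <- data.frame(
  y = runif(2, 0, 125),
  date = as.Date(c("2009-09-27", "2009-10-01"))
)

# Assumed: each dat2 date closes a period, so the breaks are the sorted dat2 dates.
ends <- sort(dat2$date)

# With left.open = TRUE, findInterval() counts the period ends strictly before
# each daily date; adding 1 gives the index of the period the day falls into.
idx <- findInterval(as.numeric(dat$date), as.numeric(ends), left.open = TRUE) + 1L

# Days after the last period end get NA and are dropped by split() below.
dat$period_end <- ends[idx]

# Summarize x within each period (mean and sum), one row per period.
res <- do.call(rbind, lapply(split(dat, dat$period_end), function(df) {
  data.frame(date = df$period_end[1], avg = mean(df$x), sum = sum(df$x))
}))
rownames(res) <- NULL

# Line the summaries up with the less frequently sampled variable.
merged <- merge(res, dat2, by = "date")
print(merged)
```

Once period_end exists it is an ordinary grouping column, so it could presumably be passed to ddply in place of yy in the call above, rather than using split() and rbind().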