Sean Baumgarten wrote on 12/14/2011 06:38:08 PM:

> Hello,
>
> I have a data frame with hourly or sub-hourly weather records that span
> several years, and from that data frame I'm trying to select only the
> records taken closest to noon for each day. Here's what I've done so far:
>
> #Add a column to the data frame showing the difference between noon and the
> #observation time (I converted time to a 0-1 scale so 0.5 represents noon):
> data$Diff_from_noon <- abs(0.5-data$Time)
>
> #Find the minimum value of "Diff_from_noon" for each Date:
> aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)
>
> The problem is that the "aggregated" data frame only has two columns: Date
> and Diff_from_noon. I can't figure out how to get the columns with the
> actual weather variables to carry over from the original data frame.
>
> Any suggestions you have would be much appreciated.
>
> Thanks,
> Sean

You don't provide any example data, so I will use the airquality data set
that ships with R. After using the aggregate() function to find the minimum
Day for each Month, merge the resulting data frame with the original data
frame to see all the columns corresponding to the selected minimums.

> aggregated <- aggregate(Day ~ Month, airquality, FUN=min)
> aggregated
  Month Day
1     5   1
2     6   1
3     7   1
4     8   1
5     9   1
> merge(aggregated, airquality)
  Month Day Ozone Solar.R Wind Temp
1     5   1    41     190  7.4   67
2     6   1    NA     286  8.6   78
3     7   1   135     269  4.1   84
4     8   1    39      83  6.9   81
5     9   1    96     167  6.9   91

For your data, the code would look like this:

aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)
merge(aggregated, data)

I recommend that you use a name other than "data" for your data frame,
since data() is a built-in R function.

Jean
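
To make the aggregate-then-merge idea concrete for data shaped like the question, here is a minimal sketch on a made-up data frame. The name "weather", the Temp column, and all of the values are assumptions for illustration only; Date, Time, and Diff_from_noon follow the question. One thing to keep in mind: merge() joins on every column name the two data frames share, so if two records on the same day are equally close to noon, both rows are returned.

```r
# Hypothetical example data: three observations on each of two days.
# Time is on a 0-1 scale, so 0.5 represents noon (as in the question).
weather <- data.frame(
  Date = as.Date(c("2011-12-01", "2011-12-01", "2011-12-01",
                   "2011-12-02", "2011-12-02", "2011-12-02")),
  Time = c(0.25, 0.48, 0.75, 0.30, 0.55, 0.90),
  Temp = c(41, 52, 47, 39, 50, 44)
)

# Absolute difference between each observation time and noon.
weather$Diff_from_noon <- abs(0.5 - weather$Time)

# Smallest difference for each Date (two columns: Date, Diff_from_noon).
closest <- aggregate(Diff_from_noon ~ Date, weather, FUN = min)

# merge() joins on the shared columns (Date and Diff_from_noon), so the
# remaining weather columns carry over for the selected rows only.
noon_records <- merge(closest, weather)
noon_records
```

With these made-up values, the result should contain one row per date: the 0.48 observation for the first day and the 0.55 observation for the second.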