Hi Jean,

Thanks for the help. I couldn't quite get the results I needed with the
merge command, but I ended up using the following work-around:

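Weather <- read.csv("Weather.csv")
# Distance of each observation from noon (TimeNumeric is on a 0-1 scale)
Weather$diff.time <- abs(0.5 - Weather$TimeNumeric)
# Within-day position of the observation closest to noon
agg <- aggregate(diff.time ~ Date, data = Weather, FUN = which.min)
# Number of rows before each day; assumes rows are grouped by Date.
# as.character() works whether Date was read as a factor or as character.
n.obs <- cumsum(rle(as.character(Weather$Date))$lengths)
n.obs <- c(0, n.obs[1:(length(n.obs) - 1)])
# Turn within-day positions into row indices and pull out those rows
noon.ind <- agg$diff.time + n.obs
noon.obs <- Weather[noon.ind,]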
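
The index arithmetic above depends on the rows being grouped by Date and on
aggregate() returning the dates in the same order as they appear in the file.
A minimal sketch of a variant that avoids both assumptions is to split the row
numbers by Date and pick the closest-to-noon row within each group. The toy
data frame below is made up for illustration and stands in for Weather.csv;
Temp is a hypothetical weather column.

```r
# Toy data standing in for Weather.csv (hypothetical values).
Weather <- data.frame(
  Date        = rep(c("2011-12-01", "2011-12-02"), each = 3),
  TimeNumeric = c(0.25, 0.48, 0.75, 0.10, 0.55, 0.90),
  Temp        = c(5.1, 9.3, 7.0, 2.2, 8.8, 4.1)
)
Weather$diff.time <- abs(0.5 - Weather$TimeNumeric)

# Split the row numbers by Date, then within each day keep the row number
# whose diff.time is smallest. which.min() returns the first minimum, so a
# tie still yields exactly one row per day.
noon.ind <- sapply(split(seq_len(nrow(Weather)), Weather$Date),
                   function(i) i[which.min(Weather$diff.time[i])])

# All original columns carry over because we index the full data frame.
noon.obs <- Weather[noon.ind, ]
noon.obs
```

Because the row numbers come straight from the full data frame, this should
work even if the rows for a given day are not contiguous.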

Cheers,
Sean

On Mon, Dec 19, 2011 at 6:03 AM, Jean V Adams <jvad...@usgs.gov> wrote:

>
> Sean Baumgarten wrote on 12/14/2011 06:38:08 PM:
>
> > Hello,
> >
> > I have a data frame with hourly or sub-hourly weather records that span
> > several years, and from that data frame I'm trying to select only the
> > records taken closest to noon for each day. Here's what I've done so far:
> >
> > #Add a column to the data frame showing the difference between noon and
> > the observation time (I converted time to a 0-1 scale so 0.5 represents
> > noon):
> > data$Diff_from_noon <- abs(0.5-data$Time)
> >
> > #Find the minimum value of "Diff_from_noon" for each Date:
> > aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)
> >
> >
> > The problem is that the "aggregated" data frame only has two columns:
> > Date and Diff_from_noon. I can't figure out how to get the columns with
> > the actual weather variables to carry over from the original data frame.
> >
> > Any suggestions you have would be much appreciated.
> >
> > Thanks,
> > Sean
>
>
> You didn't provide any example data, so I will use airquality, one of R's
> built-in datasets.  After using the aggregate() function to find the minimum
> Day for each Month, merge the resulting data frame with the original data
> frame to see all the columns corresponding to the selected minimums.
>
> > aggregated <- aggregate(Day ~ Month, airquality, FUN=min)
> > aggregated
>   Month Day
> 1     5   1
> 2     6   1
> 3     7   1
> 4     8   1
> 5     9   1
> > merge(aggregated, airquality)
>   Month Day Ozone Solar.R Wind Temp
> 1     5   1    41     190  7.4   67
> 2     6   1    NA     286  8.6   78
> 3     7   1   135     269  4.1   84
> 4     8   1    39      83  6.9   81
> 5     9   1    96     167  6.9   91
>
> For your data, the code would look like this:
> aggregated <- aggregate(Diff_from_noon ~ Date, data, FUN=min)
> merge(aggregated, data)
>
> I recommend that you use a name other than "data" for your data frame,
> since data() is a built-in R function.
>
> Jean
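
As a rough illustration of the aggregate() and merge() approach on
weather-like data, here is a small sketch. The data frame name weather, its
values, and the Temp column are invented for the example. One thing it shows
is that merge() returns every row that ties for the daily minimum, so a day
with two observations equally close to noon yields two rows; !duplicated() is
one way to keep a single row per day.

```r
# Hypothetical toy data; times are chosen so the second day has a tie
# (0.375 and 0.625 are both 0.125 away from noon).
weather <- data.frame(
  Date = rep(c("2011-12-01", "2011-12-02"), each = 3),
  Time = c(0.25, 0.5, 0.75, 0.125, 0.375, 0.625),
  Temp = c(5.1, 9.3, 7.0, 2.2, 8.8, 4.1)
)
weather$Diff_from_noon <- abs(0.5 - weather$Time)

# Smallest distance from noon for each date.
aggregated <- aggregate(Diff_from_noon ~ Date, weather, FUN = min)

# merge() joins on the shared columns (Date and Diff_from_noon), which
# brings back the remaining columns for every matching row.
merged <- merge(aggregated, weather)

# Keep only the first matching row for each date.
noon <- merged[!duplicated(merged$Date), ]
noon
```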
