Re: [R] Calculate average of many subsets based on columns in another dataframe

2016-02-11 Thread Peter Lomas
Thanks to everybody for trying to help me with this. I think there are a few workable options here. However, the most efficient option that I've found was to avoid the join/aggregate in R altogether. I've joined them at the database level to accomplish the same thing. This may not be a h

Re: [R] Calculate average of many subsets based on columns in another dataframe

2016-02-10 Thread Bert Gunter
Oh, you didn't say the intervals could overlap! If Bill D's suggestions don't suffice, try the following (again assuming all dates are in a form that allows comparison operations, e.g. via as.POSIX**): Assume you have g intervals with start dates "starts" and end dates "ends" and that you have d
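The message above is cut off, so the following is only a sketch of one way to handle overlapping intervals, not necessarily the method Bert went on to describe. It keeps the names "starts" and "ends" from the message; "dates", "vals", the sample data, and the inclusive bounds are assumptions made for illustration. It builds a g-by-d logical membership matrix with outer() and averages with one matrix product.

```r
# Hypothetical data: two overlapping intervals and six daily observations.
starts <- as.numeric(as.Date(c("2016-01-01", "2016-01-03")))
ends   <- as.numeric(as.Date(c("2016-01-05", "2016-01-04")))
dates  <- as.numeric(as.Date("2016-01-01") + 0:5)
vals   <- c(1, 2, 3, 4, 5, 6)

# inside[i, j] is TRUE when observation j falls in interval i (bounds inclusive).
inside <- outer(starts, dates, "<=") & outer(ends, dates, ">=")

# Sum of values per interval divided by the number of observations per interval.
# An interval containing no observations yields NaN (0 / 0).
averages <- as.vector(inside %*% vals) / rowSums(inside)
print(averages)
```

The membership matrix needs g * d cells, so this trades memory for the absence of an explicit loop; for very large inputs a loop or a database join may be the better fit.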

Re: [R] Calculate average of many subsets based on columns in another dataframe

2016-02-10 Thread William Dunlap via R-help
You could try pulling some of the repeated subscripting operations, especially the insertions, out of the loop. E.g., values <- observations[,"values"]; date <- observations[,"date"] ; groups$average <- vapply(seq_len(NROW(groups)), function(i) mean(values[date >= groups[i, "start"] &
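Since the snippet above is truncated, here is a self-contained sketch of the same idea: pull the columns out of the data frames once, then loop with vapply(). The sample data, the "end" column name, and the inclusive upper bound are assumptions, not taken from the original message.

```r
# Hypothetical example data; "values", "date" and "start" follow the snippet,
# while the "end" column name is assumed.
observations <- data.frame(
  date   = as.Date("2016-01-01") + 0:9,
  values = 1:10
)
groups <- data.frame(
  start = as.Date(c("2016-01-01", "2016-01-03", "2016-01-08")),
  end   = as.Date(c("2016-01-05", "2016-01-04", "2016-01-10"))
)

# Extract the vectors once so the loop body avoids repeated data-frame subscripting.
values <- observations[, "values"]
date   <- observations[, "date"]
start  <- groups[, "start"]
end    <- groups[, "end"]

# One mean per interval; the column is assigned once, after the loop.
groups$average <- vapply(
  seq_len(NROW(groups)),
  function(i) mean(values[date >= start[i] & date <= end[i]]),
  numeric(1)
)
print(groups)
```

Overlapping intervals are fine here, because each interval is evaluated independently against the full set of observations.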

Re: [R] Calculate average of many subsets based on columns in another dataframe

2016-02-10 Thread Peter Lomas
Thanks David, Bert, From what I'm reading on ?findInterval, it may not be workable because of overlapping date ranges. findInterval seems to take a series of bin breakpoints as its argument. I'm currently exploring data.table documentation and will keep thinking about this. Just on David's poin

Re: [R] Calculate average of many subsets based on columns in another dataframe

2016-02-10 Thread Bert Gunter
A strategy: 1. Convert your dates and intervals to numerics that give the days since a time origin. See as.POSIXlt (or **ct) for details and an example that does this. Should be fast... 2. Use the findInterval() function to get the interval into which each date falls. This **is** "vectorized" a
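A minimal sketch of the two steps visible above, assuming the intervals do not overlap and can be described by sorted breakpoints. The data and the names "obs_date", "obs_val" and "breaks" are hypothetical, and Date objects are used instead of POSIXlt/POSIXct because as.numeric() on a Date already gives days since 1970-01-01. The tapply() step is one possible way to finish, since the rest of the message is cut off.

```r
# Hypothetical observations: ten consecutive days with one value each.
obs_date <- as.Date("2016-01-01") + 0:9
obs_val  <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

# Hypothetical interval start dates; bin i covers [breaks[i], breaks[i + 1]).
breaks <- as.Date(c("2016-01-01", "2016-01-04", "2016-01-08"))

# Step 1: convert dates to days since the origin.
day_num   <- as.numeric(obs_date)
break_num <- as.numeric(breaks)

# Step 2: findInterval() is vectorized and returns the bin index of each date.
bin <- findInterval(day_num, break_num)

# Average the values within each bin.
avg <- tapply(obs_val, bin, mean)
print(avg)
```

findInterval() requires the breakpoints to be sorted in non-decreasing order, and it assigns each date to exactly one bin, which is why it does not extend directly to overlapping date ranges.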

Re: [R] Calculate average of many subsets based on columns in another dataframe

2016-02-10 Thread David Winsemius
> On Feb 10, 2016, at 12:18 PM, Peter Lomas wrote:
>
> Hello, I have a dataframe with a date range, and another dataframe with observations by date. For each date range, I'd like to average the values within that range from the other dataframe. I've provided code below doing what I would