On Fri, Jan 13, 2012 at 6:34 AM, doggysaywhat <chwh...@ucsd.edu> wrote:
> Hello all.
>
> I have two data frames.
>
> Group  Start   End
> G1       200   700
> G2       500  1000
> G3      2000  3000
> G4      4000  6000
> G5      7000  8000
>
> and
>
> Pos    C0   C1
> 200   0.9  0.6
> 500   0.8  0.8
> 800   0.9  0.7
> 1000  0.7  0.6
> 2000  0.6  0.4
> 2500  1.2  0.8
> 3000  0.6  1.5
> 3500  0.7  0.7
> 4000  0.8  0.8
> 4500  0.6  0.6
> 5000  0.9  0.9
> 5500  0.7  0.8
> 6000  0.8  0.7
> 6500  0.4  0.4
> 7000  0.5  0.8
> 7500  0.7  0.9
> 8000  0.9  0.5
> 8500  0.8  0.6
> 9000  0.9  0.8
>
> I need to conditionally average all values in columns C0 and C1 based upon
> the bins I defined in the first data frame. For example, for the bin G1 in
> the first data frame, the values are 200 to 700, so I would average the value
> at Pos 200 (0.9) and 500 (0.8) for C0 and then perform the same thing for
> C1.
>
> I can do this in Excel with array formulas, but I'm relatively new to R and
> would like to know if there is a function that will perform the same action. I
> don't know if this will help, but the Excel array function I used was
> average(if((range>=start)*(range<=end),range)), where the range is the
> entire Pos column.
>
> Initially I looked at the aggregate function. I can use aggregate when I
> give a single vector to be used for grouping such as (A,B,C), but I'm not
> sure how to define grouping as the bin 200-500 and the second bin as
> 500-1000 etc. and use that as my grouping vector.
>
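For reference, the Excel array formula in the question maps fairly directly onto base R. The following is only a minimal sketch, assuming the two data frames are named d1 (the bins) and d2 (the values); the names bin_means and res are made up for illustration. It treats both bin edges as inclusive, so a Pos that sits on a shared boundary (e.g. 500) is counted in both bins.

```r
# The two data frames from the question (the names d1 and d2 are assumed).
d1 <- data.frame(Group = c("G1", "G2", "G3", "G4", "G5"),
                 Start = c(200, 500, 2000, 4000, 7000),
                 End   = c(700, 1000, 3000, 6000, 8000))
d2 <- data.frame(Pos = c(200, 500, 800, 1000, 2000, 2500, 3000, 3500, 4000, 4500,
                         5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000),
                 C0  = c(0.9, 0.8, 0.9, 0.7, 0.6, 1.2, 0.6, 0.7, 0.8, 0.6,
                         0.9, 0.7, 0.8, 0.4, 0.5, 0.7, 0.9, 0.8, 0.9),
                 C1  = c(0.6, 0.8, 0.7, 0.6, 0.4, 0.8, 1.5, 0.7, 0.8, 0.6,
                         0.9, 0.8, 0.7, 0.4, 0.8, 0.9, 0.5, 0.6, 0.8))

# For each bin, select the rows with Start <= Pos <= End and average C0 and C1.
# mapply() returns one column per bin, so transpose to get one row per bin.
bin_means <- t(mapply(function(start, end) {
  in_bin <- d2$Pos >= start & d2$Pos <= end
  colMeans(d2[in_bin, c("C0", "C1")])
}, d1$Start, d1$End))

# Attach the bin labels; res has columns Group, C0, C1.
res <- data.frame(Group = d1$Group, bin_means)
print(res)
```

This loops over the bins explicitly, which is fine for a handful of bins; for many bins or many positions, a join-based approach is likely to scale better.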
Here is an sqldf solution where the two input data frames are d1 and d2 (as in Jeff's post). Note that Group is quoted since it's an SQL keyword:

library(sqldf)
sqldf("select d1.'Group', avg(d2.C0), avg(d2.C1)
       from d1, d2
       where d2.Pos between d1.Start and d1.End
       group by d1.'Group'")

The result is:

  Group avg(d2.C0) avg(d2.C1)
1    G1       0.85  0.7000000
2    G2       0.80  0.7000000
3    G3       0.80  0.9000000
4    G4       0.76  0.7600000
5    G5       0.70  0.7333333

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.