Jim Mankin liked your message with Boxer.

On April 18, 2015 at 10:48:17 AM MST, Charles C. Berry <ccbe...@ucsd.edu> wrote:

On Sat, 18 Apr 2015, Brant Inman wrote:

> I have two large data frames with the following structure:
>
> > df1
>   id       date test1.result
> 1  a 2009-08-28            1
> 2  a 2009-09-16            1
> 3  b 2008-08-06            0
> 4  c 2012-02-02            1
> 5  c 2010-08-03            1
> 6  c 2012-08-02            0
>
> > df2
>   id       date test2.result
> 1  a 2011-02-03            1
> 2  b 2011-09-27            0
> 3  b 2011-09-01            1
> 4  c 2009-07-16            0
> 5  c 2009-04-15            0
> 6  c 2010-08-10            1
>
> I need to match items in df2 to those in df1 with specific matching
> criteria. I have written a looped matching algorithm that works, but it
> is very slow with my large datasets. I am requesting help on making a
> version of this code that is faster and "vectorized" so to speak.

As I see in your posted code, you match id's exactly, dates according to a range, and count the number of positive test results in the second data.frame.

For this, the countOverlaps() function of the GenomicRanges package will do the trick with suitably defined GRanges objects. Something like:

    require(GenomicRanges)
    date1
    date2
    lagdays
    predays
    gr1
    gr2  IRanges(start=date2+predays,end=date2+lagdays), strand="*")[ df2$test2.result==1,]
    df1$test2.count

For the example data.frames (as rendered by Jim Lemon's code), this yields

> df1
  id       date test1.result test2.count
1  a 2009-08-28            1           0
2  a 2009-09-16            1           0
3  b 2008-08-06            0           0
4  c 2012-02-02            1           0
5  c 2010-08-03            1           1
6  c 2012-08-02            0           0

The GenomicRanges package is at

http://www.bioconductor.org/packages/release/bioc/html/GenomicRanges.html

where you will find installation instructions and links to vignettes.

HTH,

chuck
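
The matching rule described above (exact id, a date window around each df2 date, and a count of positive test2 results) can also be sketched in base R, which may be useful as a cross-check on the countOverlaps() result. This is only an illustration and not the code from the thread: it swaps in merge() for GenomicRanges, the helper name count_positive_matches is hypothetical, and the window values predays = -30 and lagdays = 0 are assumptions chosen for the example, since the values used in the thread are not shown.

```r
# Example data as printed in the thread
df1 <- data.frame(
  id = c("a", "a", "b", "c", "c", "c"),
  date = as.Date(c("2009-08-28", "2009-09-16", "2008-08-06",
                   "2012-02-02", "2010-08-03", "2012-08-02")),
  test1.result = c(1, 1, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  id = c("a", "b", "b", "c", "c", "c"),
  date = as.Date(c("2011-02-03", "2011-09-27", "2011-09-01",
                   "2009-07-16", "2009-04-15", "2010-08-10")),
  test2.result = c(1, 0, 1, 0, 0, 1),
  stringsAsFactors = FALSE
)

# ASSUMED window in days relative to each df2 date (hypothetical values)
predays <- -30
lagdays <- 0

# Hypothetical helper: for each row of df1, count the positive df2 rows
# with the same id whose window [date2 + predays, date2 + lagdays]
# contains the df1 date.
count_positive_matches <- function(df1, df2, predays, lagdays) {
  # Keep only positive test2 results
  pos <- df2[df2$test2.result == 1, c("id", "date")]
  names(pos)[2] <- "date2"

  # Remember the original row order of df1 before merging
  key <- data.frame(row = seq_len(nrow(df1)), id = df1$id, date = df1$date,
                    stringsAsFactors = FALSE)

  # All same-id pairs of (df1 row, positive df2 row)
  m <- merge(key, pos, by = "id")

  # Offset in days of the df1 date from the df2 date
  offset <- as.numeric(m$date - m$date2)
  hit <- offset >= predays & offset <= lagdays

  # Count hits per original df1 row; rows without hits get 0
  tabulate(m$row[hit], nbins = nrow(df1))
}

df1$test2.count <- count_positive_matches(df1, df2, predays, lagdays)
print(df1)
```

With the assumed window, only row 5 (id c, 2010-08-03) gets a count of 1, from the positive df2 result dated 2010-08-10; this agrees with the test2.count column shown in the reply. The merge() step builds every same-id pair, so on very large data with many rows per id it can use a lot of memory, which is where an interval-based approach such as countOverlaps() is likely to scale better.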

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
