Thanks to everyone. Joshua's response seemed the most concise, but it used so much memory that R just gave an error. I checked the other replies and, all in all, came up with the following. I thought I would share it with others and get comments.
My structure was as follows:

ACCOUNT  RULE  DATE
A1       xxxx  2010-01-01
A2       xxxx  2007-05-01
A2       xxxx  2007-05-01
A2       xxxx  2005-05-01
A2       xxxx  2005-05-01
A1       xxxx  2009-01-01

The most efficient solution I came across involves the following steps:

1. Find the latest date for each account, and convert the result to a data frame:

   a<-tapply(my.mapping$DATE,my.mapping$ACCOUNT,max)
   a<-data.frame(ACCOUNT=names(a),DT=as.Date(a,"%Y-%m-%d"))

2. Merge that set with the original data:

   my.mapping<-merge(x=my.mapping,y=a,by.x="ACCOUNT",by.y="ACCOUNT")

3. Create a TAKE column, which flags whether the date of the row is the maximum date for the account:

   my.mapping<-cbind(my.mapping,TAKE=my.mapping$DATE==my.mapping$DT)

4. Filter out all lines except those with TAKE==TRUE:

   my.mapping<-my.mapping[my.mapping$TAKE==TRUE,]

The running time for my whole list was 4.5 sec, which is far better than any of the other approaches I tried. Let me have your thoughts on that.

Ali
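
P.S. For anyone who wants to try the four steps without my data, here is a minimal self-contained sketch on the six sample rows above. It makes some assumptions that are not from my real data set: DATE is stored as character in ISO format (so max() picks the latest date by plain string comparison), the RULE values are just placeholders, and the intermediate names merged and latest are made up for illustration.

```r
# Toy data mirroring the layout shown in the post; RULE values are placeholders.
my.mapping <- data.frame(
  ACCOUNT = c("A1", "A2", "A2", "A2", "A2", "A1"),
  RULE    = rep("xxxx", 6),
  DATE    = c("2010-01-01", "2007-05-01", "2007-05-01",
              "2005-05-01", "2005-05-01", "2009-01-01"),
  stringsAsFactors = FALSE
)

# Step 1: latest date per account. ISO-formatted strings sort chronologically,
# so max() on character works here (assumption: DATE is character, not factor).
a <- tapply(my.mapping$DATE, my.mapping$ACCOUNT, max)
a <- data.frame(ACCOUNT = names(a),
                DT = as.Date(as.vector(a), "%Y-%m-%d"))

# Step 2: attach each account's latest date to every row of that account.
merged <- merge(x = my.mapping, y = a, by = "ACCOUNT")

# Step 3: flag rows whose own date equals the account's latest date.
merged$TAKE <- as.Date(merged$DATE, "%Y-%m-%d") == merged$DT

# Step 4: keep only the flagged rows.
latest <- merged[merged$TAKE, ]
print(latest)
```

On the sample rows this keeps three lines: one for A1 (2010-01-01) and two for A2, because the latest A2 date (2007-05-01) occurs twice. If exactly one line per account is wanted, something like latest[!duplicated(latest$ACCOUNT), ] could be applied afterwards.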