On Sat, Feb 11, 2012 at 07:17:54PM +0100, Kai Mx wrote:
> Hi everybody,
>
> I have a large dataframe similar to this one:
>
> knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
> kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315',
> '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
> format="%Y%m%d")
> kdata <- data.frame (knames, kdate)
>
> I would like to add a new variable to the dataframe counting the
> occurrences of different values in knames in their order of appearance
> (according to the date as indicated in kdate). The solution should be a
> variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop,
> but there must be a more elegant way to do this.

Hi.

Is the first 2 in the new variable due to the fact that the name is "ab"
and "ab" at row 5 has an older date? If so, then try the following:

  ind <- order(kdata$kdate)
  f <- function(x) seq.int(along.with=x)
  kdata$x <- ave(1:nrow(kdata), kdata$knames[ind], FUN=f)[order(ind)]

     knames      kdate x
  1      ab 2011-10-01 2
  2      aa 2011-11-02 2
  3      ac 2010-10-01 1
  4      ad 2010-03-15 1
  5      ab 2010-12-01 1
  6      ac 2011-01-05 2
  7      aa 2010-10-01 1
  8      ad 2011-05-04 2
  9      ae 2011-06-03 1
  10     af 2011-02-01 1

kdata$knames[ind] orders the names by increasing date. ave(...)[order(ind)]
reorders the output of ave() to the original row order, because order(ind)
is the inverse of the permutation ind.

Hope this helps.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
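
As a side note, the reordering step can be checked in isolation. The sketch below rebuilds the example data from the question, splits the one-liner into named steps, and confirms that order(ind) undoes the permutation ind. The names inv, cnt_sorted and x2 are introduced here for illustration only. The last part shows a possible alternative using rank() within each name; it is not from the original reply, and it assumes that breaking ties by row position is acceptable when the same name occurs twice on the same date.

```r
# Rebuild the example data from the question.
knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
kdate <- as.Date(c('20111001', '20111102', '20101001', '20100315',
                   '20101201', '20110105', '20101001', '20110504',
                   '20110603', '20110201'), format = "%Y%m%d")
kdata <- data.frame(knames, kdate)

# ind[k] is the row number of the k-th oldest date.
ind <- order(kdata$kdate)

# order(ind) is the inverse permutation of ind: indexing a date-ordered
# vector with it puts the values back into the original row order.
inv <- order(ind)
stopifnot(identical(ind[inv], seq_len(nrow(kdata))))

# Running count within each name, computed on the date-ordered names ...
cnt_sorted <- ave(seq_len(nrow(kdata)), kdata$knames[ind], FUN = seq_along)
# ... and mapped back to the original rows.
kdata$x <- cnt_sorted[inv]

# Alternative sketch (an assumption, not part of the original reply):
# rank the dates within each name directly, so no reordering is needed.
kdata$x2 <- ave(as.numeric(kdata$kdate), kdata$knames,
                FUN = function(d) rank(d, ties.method = "first"))

print(kdata)
```

Both columns should agree on this data. The rank() variant avoids the two permutations, at the cost of going through a numeric copy of the dates.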