Hi,

Maybe you can try:

###Use dput()
dat1 <- structure(list(
  trialno = c(11301L, 11301L, 11301L, 11301L, 11301L, 11301L, 11301L,
    11301L, 11301L, 11301L, 11302L, 11302L, 11302L, 11302L, 11302L,
    11302L, 11302L, 11302L, 11302L, 11302L),
  event = c("pm_intake", "am_intake", "pk1", "pm_intake", "am_intake",
    "pk1", "pk2", "pm_intake", "am_intake", "pk1", "pm_intake",
    "am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pk2",
    "pm_intake", "am_intake", "pk1"),
  date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03",
    "2011-02-03", "2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03",
    "2011-02-03"),
  time = c("19:00", "07:00", "10:30", "19:00", "07:00", "09:54", "13:07",
    "19:00", "07:00", "11:30", "19:00", "07:00", "10:30", "19:00", "07:00",
    "09:54", "13:07", "19:00", "07:00", "11:30")),
  .Names = c("trialno", "event", "date", "time"),
  class = "data.frame",
  row.names = c("3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
    "14", "15", "16", "17", "18", "19", "20", "21", "22"))

splitData <- split(dat1, dat1$trialno)

#using your code
# Within each patient, flag the first row of every date and take the running total
res <- unsplit(lapply(splitData, function(x)
  within(x, OCC <- cumsum(ave(seq_along(date), date, FUN = seq_along) == 1))),
  dat1$trialno)
res$OCC
#[1] 1 2 2 3 4 4 4 5 6 6 1 2 2 3 4 4 4 5 6 6

#or
# Same idea without split(): ave() groups by trialno; as.numeric() is needed
# because ave() returns character when its first argument (date) is character
within(dat1, OCC <- as.numeric(ave(date, trialno, FUN = function(x)
  cumsum(ave(seq_along(x), x, FUN = seq_along) == 1))))

A.K.

On Thursday, November 21, 2013 2:04 PM, Andrzej Bienczak <andrzej.bienc...@googlemail.com> wrote:

Hi All,

I'm trying to figure out how to add a column to my data set that holds a count of unique events based on date.
Here is a part of my data set:

   trialno     event       date  time
3    11301 pm_intake 2010-11-24 19:00
4    11301 am_intake 2010-11-25 07:00
5    11301       pk1 2010-11-25 10:30
6    11301 pm_intake 2010-12-22 19:00
7    11301 am_intake 2010-12-23 07:00
8    11301       pk1 2010-12-23 09:54
9    11301       pk2 2010-12-23 13:07
10   11301 pm_intake 2011-02-02 19:00
11   11301 am_intake 2011-02-03 07:00
12   11301       pk1 2011-02-03 11:30

Basically, each date within each patient indicates a new occasion. If a patient has just a drug administration on a given day, it's one occasion, but if a patient had a drug administration and two measurements on the same day, they all count as the same occasion. The data set does not have a regular pattern (each patient has a different number of events on each date and a different number of events in total).

What I'm trying to achieve is:

   trialno     event       date  time OCC
3    11301 pm_intake 2010-11-24 19:00   1
4    11301 am_intake 2010-11-25 07:00   2
5    11301       pk1 2010-11-25 10:30   2
6    11301 pm_intake 2010-12-22 19:00   3
7    11301 am_intake 2010-12-23 07:00   4
8    11301       pk1 2010-12-23 09:54   4
9    11301       pk2 2010-12-23 13:07   4
10   11301 pm_intake 2011-02-02 19:00   5
11   11301 am_intake 2011-02-03 07:00   6
12   11301       pk1 2011-02-03 11:30   6

I think I should apply some kind of loop to identify the unique dates within each patient and count them.

I thought about splitting the whole data set by patient using the split function:

splitData <- split(data, data$trialno)

and then applying lapply and transform to add a new column OCC (occasion), but I don't know how to count those as integers. I was thinking:

splitData <- lapply(splitData, function(df) { transform(df, OCC = ???????????????) })
do.call("rbind", splitData)

I know how to do it in Excel:

=IF(D5=D4, E4,E4+1)

(if the value in the neighbouring cell is the same as in the cell above it, then the value in my cell is the same as the one above; otherwise it's one greater). This way the first cell in column E has to be 1, and the others are integers counting the new date events.

Help much appreciated!
Andrzej

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
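
The Excel rule quoted in the question (keep the previous count when the date repeats, otherwise add one) can also be written directly as a comparison of each date with the one on the row above. This is only a minimal sketch, not code from the thread: `occ_from_dates` is a hypothetical helper name, `toy` is made-up data, and it assumes the rows are already in chronological order within each trialno.

```r
# Hypothetical helper (not from the thread): number the occasions for one patient.
# Assumes the dates are already in chronological order for that patient.
occ_from_dates <- function(d) {
  # TRUE where the date differs from the row above; the first row always
  # starts occasion 1, mirroring the "first cell has to be 1" Excel rule
  is_new <- c(TRUE, d[-1] != d[-length(d)])
  cumsum(is_new)
}

# Made-up toy data for illustration only
toy <- data.frame(
  trialno = c(1L, 1L, 1L, 2L, 2L),
  date = c("2010-11-24", "2010-11-25", "2010-11-25",
           "2010-11-24", "2010-11-24"),
  stringsAsFactors = FALSE
)

# ave() applies the helper to the row indices of each trialno separately,
# so the count restarts at 1 for every patient
toy$OCC <- ave(seq_along(toy$date), toy$trialno,
               FUN = function(i) occ_from_dates(toy$date[i]))
toy$OCC
```

Unlike the first-occurrence approach, this version starts a new occasion whenever the date changes from the previous row, so the two only agree when each patient's rows are sorted by date.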