On Fri, Apr 23, 2010 at 4:05 AM, chrisli1223 <chri...@austwaterenv.com.au> wrote:
>
> Hi all,
>
> I have a dataset similar to the following
>
> Name  Date       Value
> A     1/01/2000  4
> A     2/01/2000  4
> A     3/01/2000  5
> A     4/01/2000  4
> A     5/01/2000  1
> B     6/01/2000  2
> B     7/01/2000  1
> B     8/01/2000  1
>
> I would like R to remove duplicates based on column 1 and 3 only. In
> addition, I would like R to remove duplicates based on the underlying and
> overlying row only. For example, for A, I would like to remove row 2 only
> and keep row 1, 3 and 4.
>
> I have tried: unique() and replicated(), but I do not have much success. I
> have also tried: dataset<-c(1,diff(dataset)!=0), but I don't know how to
> apply it to this multi-column situation.
>
> Any help would be greatly appreciated.
>
> Thanks in advance,
> Chris
> --
Hi,

This code is a bit ugly, but it works on your sample data. Hope it helps.

/Gustaf

    library(zoo)

    # Read the copied data (columns: Name, Date, Value) from the clipboard
    test <- read.table("clipboard", header = TRUE)
    # Combine the two columns that define a duplicate into a single key
    test$code <- paste(test$Name, test$Value, sep = "")
    # For each interior row, flag it if its key matches the row above or below
    drop.ndx <- rollapply(zoo(test$code), 3, function(x) x[2] %in% c(x[1], x[3]))
    # The first and last rows have no full window, so they are always kept
    drop.ndx <- c(FALSE, drop.ndx, FALSE)
    test[!drop.ndx, ]

--
Gustaf Rydevik, M.Sci.
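
A possible base R alternative, offered only as a sketch and not as part of the reply above: compare each row with the row directly above it and drop it when both Name and Value match. The rollapply approach flags a row when it matches either neighbour, so a run of exactly two equal rows in the middle of the data would have both rows flagged; comparing against the previous row only keeps the first row of every run. The data frame below is retyped from the question, and the names `same.as.prev` and `result` are made up for illustration. It assumes the data has at least one row and no missing values in Name or Value.

```r
# Sample data from the question (dates kept as plain strings)
test <- data.frame(
  Name  = c("A", "A", "A", "A", "A", "B", "B", "B"),
  Date  = c("1/01/2000", "2/01/2000", "3/01/2000", "4/01/2000",
            "5/01/2000", "6/01/2000", "7/01/2000", "8/01/2000"),
  Value = c(4, 4, 5, 4, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

n <- nrow(test)

# TRUE where a row has the same Name and Value as the row directly above it.
# The first row has nothing above it, so it is never flagged.
same.as.prev <- c(FALSE,
                  test$Name[-1]  == test$Name[-n] &
                  test$Value[-1] == test$Value[-n])

# Keep only the first row of each run of consecutive duplicates
result <- test[!same.as.prev, ]
print(result)
```

On the sample data this keeps rows 1, 3, 4, 5, 6 and 7. Note that it keeps row 7 and drops row 8, whereas the rollapply version keeps row 8 and drops row 7; which of the two duplicate rows should survive is a choice the question leaves open.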