> > Is there any similar function in R to the first. in SAS?
>
> ?duplicated
>
> a$d <- ifelse( duplicated( a$a ), 0 , 1 )
>
> a$d.2 <- as.numeric( !duplicated( a$a ) )

Actually, duplicated() does not reproduce SAS' first. operator, though it
may suffice for the OP's needs. duplicated() flags a value that has appeared
anywhere earlier in the vector, whereas first. flags the start of each run
of a key in sorted data. To illustrate, let's start with a dataframe of
3 key columns and some data in x:

  tt <- data.frame(k1 = rep(1:3, each=10),
                   k2 = rep(1:5, each=2, times=3),
                   k3 = rep(1:2, times=15),
                   x  = 1:30)

  # Try to mimic what the following SAS datastep would do,
  # assuming 'tt' is already sorted:
  #   data foo;
  #     set tt;
  #     by k1 k2;
  #     put first.k1= first.k2=;
  #   run;

  # SAS' first. operations would result in these values:
  tt$sas.first.k1 <- rep(c(1, rep(0,9)), 3)
  tt$sas.first.k2 <- rep(1:0, 15)

  # R duplicated() returns these values.  You can see they
  # are the same for k1, but dissimilar after row 10 for k2,
  # because every value of k2 has already been seen by then.
  tt$duplicated.k1 <- 0+!duplicated(tt$k1)
  tt$duplicated.k2 <- 0+!duplicated(tt$k2)

  # I've found I need to lag a column to mimic SAS' first.
  # operator, as follows, though perhaps someone else knows
  # differently.  Note this does not work on unordered
  # dataframes!
  lag.k1 <- c(NA, head(tt$k1, -1))
  tt$r.first.k1 <- ifelse(is.na(lag.k1), 1, tt$k1 != lag.k1)

  # With 'by k1 k2', SAS also sets first.k2 whenever k1 changes,
  # so include r.first.k1 in the comparison.
  lag.k2 <- c(NA, head(tt$k2, -1))
  tt$r.first.k2 <- ifelse(is.na(lag.k2), 1,
                          tt$r.first.k1 | tt$k2 != lag.k2)

Mimicking SAS' last. operation can be done in a similar manner, by
anti-lagging (leading) the column of interest and comparing each row
with the next one instead of the previous one.

Enjoy the days,
cur
--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
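
P.S. The note above about mimicking SAS' last. by anti-lagging a column can be sketched as follows. This is only an illustration under stated assumptions: the dataframe is already sorted by the keys, the keys contain no NA values, and the names lead.k1, lead.k2, r.last.k1 and r.last.k2 are made up for this example rather than taken from the original post.

```r
# Same toy data as in the post, rebuilt so this block runs on its own.
tt <- data.frame(k1 = rep(1:3, each = 10),
                 k2 = rep(1:5, each = 2, times = 3),
                 k3 = rep(1:2, times = 15),
                 x  = 1:30)

# "Anti-lag" (lead) each key: shift it up one row and pad the end with NA.
# Assumes tt is sorted by k1, k2 and that the keys hold no NA values.
lead.k1 <- c(tt$k1[-1], NA)
lead.k2 <- c(tt$k2[-1], NA)

# A row is last in its k1 group if it is the final row of the data
# or if the next row has a different k1.
last.k1 <- is.na(lead.k1) | tt$k1 != lead.k1

# Under 'by k1 k2', a row is last in its k2 group if k1 is about to
# change or if the next row has a different k2.
last.k2 <- last.k1 | tt$k2 != lead.k2

# Store as 0/1 columns, as the first. columns were stored above.
tt$r.last.k1 <- as.numeric(last.k1)
tt$r.last.k2 <- as.numeric(last.k2)
```

The last.k2 flag is OR-ed with last.k1 because, with nested BY variables, the end of an outer group also ends every inner group. On this particular dataset the two conditions happen to coincide, so the extra term only matters for data where k2 keeps the same value across a change in k1.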