I have a large data set in which one of the columns needs to be a unique identifier (ID) for each row. However, a few of the rows share the same ID. For each duplicated ID, I need to randomly draw one of the rows to keep in the data frame and drop all the other rows with that ID.
For example:

    v1 <- c(1,2,3,4,5,6,7)
    v2 <- c(10,20,30,40,50,60,70)
    ID <- c("A","A","B","B","C","D","E")
    DF <- data.frame(v1,v2,ID)

Here I only need one of the A rows and one of the B rows in the data frame. I tried making ID a factor and using apply() to randomly draw one, but I could not get it to work. Any ideas would be greatly appreciated.

Thanks,
EG

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
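For reference, one way the task described above might be approached in base R is to shuffle the rows and then keep the first occurrence of each ID. This is only a sketch under assumptions: the names `shuffled` and `DF_unique` and the seed value are illustrative and not from the original post, and the final reordering step assumes DF still has its default row names (1, 2, 3, ...).

```r
# Example data from the post.
v1 <- c(1, 2, 3, 4, 5, 6, 7)
v2 <- c(10, 20, 30, 40, 50, 60, 70)
ID <- c("A", "A", "B", "B", "C", "D", "E")
DF <- data.frame(v1, v2, ID)

# Arbitrary seed (an assumption), used only to make the random draw repeatable.
set.seed(42)

# Put the rows in a random order.
shuffled <- DF[sample(nrow(DF)), ]

# duplicated() flags every row whose ID has already appeared, so negating it
# keeps the first row seen for each ID. Because the order is random, each row
# within an ID group is equally likely to be the one kept.
DF_unique <- shuffled[!duplicated(shuffled$ID), ]

# Optional: restore the original row order using the retained row names.
DF_unique <- DF_unique[order(as.integer(rownames(DF_unique))), ]

print(DF_unique)
```

Rows with an ID that occurs only once (C, D and E here) are always kept. Dropping the set.seed() call gives a different random draw on each run.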