Hi Andre,
I've taken a different approach to that employed by Eric:

A<-data.frame(c("01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020",
 "01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020",
 "01/01/2020","01/02/2020","01/02/2020","01/02/2020","01/02/2020","01/03/2020",
 "01/03/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020",
 "01/04/2020","01/04/2020","01/04/2020","01/04/2020","01/04/2020",
 "01/04/2020","01/04/2020","01/04/2020","01/04/2020"),
 c(23,22,12,24,26,19,34,15,17,19,23,33,23,34,25,23,25,24,34,33,31,32,24,22,21,
 23,22,22,21,23,23,21),
 c(13,11,12,9,8,9,7,10,11,9,6,11,9,8,9,10,11,12,9,8,10,4,6,9,8,9,10,11,14,12,
 13,11),
 c(1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,1,2,
 3,4,5,6,7,1,2,3,4,5,6,7,8,9))
colnames(A) <- c("Date", "CO2", "CH4", "ID")
# add a variable to compile selected rows
A$select<-FALSE
# get all unique dates
alldates<-unique(A$Date)
for(date in alldates) {
 # get indices for this date
 date_indices<-which(A$Date == date)
 # only mark the first 8 as TRUE
 A$select[date_indices[1:8]]<-all(1:8 %in% A$ID[date_indices])
}
A
A[A$select,]
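For a dataset of the size André mentions, the same selection can probably be done without an explicit loop. What follows is only a sketch, not a tested replacement for the loop above. It assumes that the rows wanted are those with ID 1 to 8 on dates where all of 1 to 8 occur. The data frame "toy" is made up and copies only the Date and ID layout of A, and "complete" and "keep" are illustrative names.

```r
# Made-up stand-in for A: same Date/ID layout, no CO2/CH4 columns
toy <- data.frame(
  Date = rep(c("01/01/2020", "01/02/2020", "01/03/2020", "01/04/2020"),
             times = c(12, 4, 7, 9)),
  ID   = c(1:12, 1:4, 1:7, 1:9)
)

# ave() applies the function within each Date and spreads the single
# TRUE/FALSE result over all rows of that Date (stored as 1/0)
complete <- ave(toy$ID, toy$Date,
                FUN = function(id) all(1:8 %in% id)) == 1

# Assumption: on complete dates only IDs 1..8 are wanted
keep <- complete & toy$ID %in% 1:8

B <- toy[keep, ]
B
```

Unlike the loop, this picks rows by ID value rather than by position within the date, so it does not rely on the IDs being sorted within each date.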
If you don't want to add a column you can set up "select" as a vector.

Jim

On Mon, Jun 21, 2021 at 6:18 PM Eric Berger <ericjber...@gmail.com> wrote:
>
> Hi André,
> It's not 100% clear to me what you are asking. I am interpreting the
> question as selecting the data from those dates for which all of
> 1,2,3,4,5,6,7,8 appear in the ID column.
> My approach determines the dates satisfying this property, which I put into
> a vector dtV. Then I take the rows of A for which the date is in the vector
> dtV.
>
> library(dplyr)
> dtV <- A %>% mutate(x=2^(ID-1)) %>% group_by(Date) %>%
>   summarise(y=(sum(unique(x))%%256==255)) %>% filter(y==TRUE) %>% select(Date)
> B <- A[ A$Date %in% dtV$Date, ]
>
> B is the subset of A that you want.
>
> HTH,
> Eric
>
> On Mon, Jun 21, 2021 at 10:23 AM André Luis Neves <andrl...@ualberta.ca>
> wrote:
>
> > Dear R users,
> >
> > I want to select only the data containing a continuous number of *ID* from
> > 1-8 in each *DATE*. Note, I do not want to select data that do not contain
> > a continuous number in *ID* from 1-8 (e.g. data on *DATE* 01/02/2020 and
> > 01/03/2020). The dataset is a huge matrix with 24 columns and 1.5 million
> > rows, but I have prepared a reproducible code for your reference below.
> >
> > Here is the reproducible code:
> >
> > A =
> > data.frame(c("01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020",
> > "01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/02/2020","01/02/2020",
> > "01/02/2020","01/02/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020",
> > "01/03/2020","01/03/2020","01/04/2020","01/04/2020","01/04/2020","01/04/2020","01/04/2020",
> > "01/04/2020","01/04/2020","01/04/2020","01/04/2020"),
> > c(23,22,12,24,26,19,34,15,17,19,23,33,
> > 23,34,25,23,25,24,34,33,31,32,24,22,21,23,22,22,21,23,23,21),
> > c(13,11,12,9,8,9,7,10,11,9,6,11,
> > 9,8,9,10,11,12,9,8,10,4,6,9,8,9,10,11,14,12,13,11),
> > c(1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,1,2,
> > 3,4,5,6,7,1,2,3,4,5,6,7,8,9))
> > colnames(A) <- c("Date", "CO2", "CH4", "ID")
> > A
> >
> > Thank you,
> > --
> > Andre
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.