Or ?rle

Bert
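For reference, a minimal sketch of how the rle() suggestion might be applied to the data quoted below. It assumes base R only and that rows are already in date order within each CASE_ID. The helper name count_changes is hypothetical and does not come from the thread. It reuses the 3-month window from the quoted data.table reply and counts changes as the number of runs minus one.

```r
x <- read.table(text = "CASE_ID YEAR_MTH ATT_1
CB26A 201302 1
CB26A 201302 0
CB26A 201302 0
CB26A 201303 1
CB26A 201303 1
CB26A 201304 0
CB26A 201305 1
CB26A 201305 0
CB26A 201306 1
CB27A 201304 0
CB27A 201304 0
CB27A 201305 1
CB27A 201306 1
CB27A 201306 0
CB27A 201307 0
CB27A 201308 1", header = TRUE, as.is = TRUE)

# Represent each YEAR_MTH as the first day of that month
x$MYD <- as.Date(paste0(x$YEAR_MTH, "01"), format = "%Y%m%d")

# Hypothetical helper: changes within 3 months of the first entry.
# Assumes the rows of d are sorted by date.
count_changes <- function(d) {
  end_date <- seq(d$MYD[1L], by = "3 months", length.out = 2L)[2L]
  vals <- d$ATT_1[d$MYD <= end_date]
  # rle() collapses consecutive equal values into runs;
  # each boundary between two runs is one change
  length(rle(vals)$lengths) - 1L
}

res <- sapply(split(x, x$CASE_ID), count_changes)
print(res)
# CB26A CB27A
#     5     2
```

This gives the same counts as the data.table reply quoted below (2 for CB27A, not 3), since the window 201304-201307 contains the values 0 0 1 1 0 0.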
Sent from my iPhone -- please excuse typos.

> On Aug 4, 2014, at 8:28 AM, jim holtman <jholt...@gmail.com> wrote:
>
> Try this, but I only get 2 changes for CB27A instead of the 3 you indicated:
>
>> require(data.table)
>> x <- read.table(text = "CASE_ID YEAR_MTH ATT_1
> + CB26A 201302 1
> + CB26A 201302 0
> + CB26A 201302 0
> + CB26A 201303 1
> + CB26A 201303 1
> + CB26A 201304 0
> + CB26A 201305 1
> + CB26A 201305 0
> + CB26A 201306 1
> + CB27A 201304 0
> + CB27A 201304 0
> + CB27A 201305 1
> + CB27A 201306 1
> + CB27A 201306 0
> + CB27A 201307 0
> + CB27A 201308 1", header = TRUE, as.is = TRUE)
>> setDT(x)
>> # convert to a Date object for comparison
>> x[, MYD := as.Date(paste0(YEAR_MTH, '01'), format = "%Y%m%d")]
>> # separate by CASE_ID and only keep the first 3 months
>> x[
> +     , {
> +         # determine the end date as 3 months from the first date
> +         endDate <- seq(MYD[1L], by = '3 months', length = 2)[2L]
> +         # extract what is changing
> +         changes <- ATT_1[(MYD >= MYD[1L]) & (MYD <= endDate)]
> +         # now count the changes
> +         list(nChanges = sum(head(changes, -1L) != tail(changes, -1L)))
> +     }
> +     , by = CASE_ID
> +     ]
>    CASE_ID nChanges
> 1:   CB26A        5
> 2:   CB27A        2
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
>
>> On Wed, Jul 30, 2014 at 3:08 AM, Abhinaba Roy <abhinabaro...@gmail.com>
>> wrote:
>> Dear R-helpers,
>>
>> I want to count the number of times ATT_1 has changed in a period of 3
>> months (can be 4 months) from the first YEAR_MTH entry for a CASE_ID. So if
>> for a CASE_ID we have data only for two distinct YEAR_MTH, then all the
>> entries should be considered, otherwise only the relevant entries will be
>> considered for calculation.
>> E.g. if the first YEAR_MTH entry is 201304 then get the number of changes
>> till 201307 (inclusive), similarly if the first YEAR_MTH entry is 201302
>> then get the number of changes till 201305.
>>
>> Dataset
>> CASE_ID YEAR_MTH ATT_1
>> CB26A   201302   1
>> CB26A   201302   0
>> CB26A   201302   0
>> CB26A   201303   1
>> CB26A   201303   1
>> CB26A   201304   0
>> CB26A   201305   1
>> CB26A   201305   0
>> CB26A   201306   1
>> CB27A   201304   0
>> CB27A   201304   0
>> CB27A   201305   1
>> CB27A   201306   1
>> CB27A   201306   0
>> CB27A   201307   0
>> CB27A   201308   1
>>
>> The final dataset should look like
>>
>> ID_CASE No.of changes
>> CB26A   5
>> CB27A   3
>>
>> where 'No.of changes' refer to the change in 3 months (201302-201305 for
>> CB26A and 201304-201307 for CB27A).
>>
>> How can this be done in R?
>>
>> Regards,
>> Abhinaba Roy
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.