Right you are, A.K. Thanks for catching that. It should be ...

data2 <- data.frame(
  id = rep(data1$ID, 3),
  visit = rep(1:3, rep(dim(data1)[1], 3)),
  date = as.Date(c(data1$V1Date, data1$V2date, data1$V3date), "%m/%d/%y"),
  dva = c(data1$V1a, data1$V2a, data1$V3a),
  dvb = c(data1$V1b, data1$V2b, data1$V3b),
  dvc = c(data1$V1c, data1$V2c, data1$V3c))
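
A quick way to check the corrected version is the sketch below, which rebuilds the long-format data from the example in this thread and then pulls the last visit for each ID. Several details are assumptions rather than anything confirmed in the thread: the data are read with `text =` and `stringsAsFactors = FALSE` so that `c()` concatenates strings rather than factor codes, and the pattern column `abc` is built from `dva`, `dvb` and `dvc` (the earlier message used `dvb` twice, which is assumed here to be a slip). The name `last_visit` is only illustrative.

```r
# Example data from the thread, header kept on one line.
# ASSUMPTION: text = and stringsAsFactors = FALSE are choices made here.
data1 <- read.table(header = TRUE, stringsAsFactors = FALSE,
  text = "ID V1Date V1a V1b V1c V2date V2a V2b V2c V3date V3a V3b V3c
001 4/5/12 Yes Yes No 6/18/12 Yes No Yes NA NA NA NA
002 1/22/12 No No Yes 7/5/12 Yes No Yes NA NA NA NA
003 4/5/12 Yes No No 9/4/12 Yes No Yes 11/1/12 Yes No Yes
004 8/18/12 Yes Yes Yes 9/22/12 Yes No Yes NA NA NA NA
005 9/6/12 Yes No No NA NA NA NA 12/4/12 Yes No Yes")

# Rearrange to one row per id and visit (corrected dvb and dvc columns).
data2 <- data.frame(
  id = rep(data1$ID, 3),
  visit = rep(1:3, each = nrow(data1)),
  date = as.Date(c(data1$V1Date, data1$V2date, data1$V3date),
                 format = "%m/%d/%y"),
  dva = c(data1$V1a, data1$V2a, data1$V3a),
  dvb = c(data1$V1b, data1$V2b, data1$V3b),
  dvc = c(data1$V1c, data1$V2c, data1$V3c),
  stringsAsFactors = FALSE)

# Three-letter pattern, e.g. "YNY".
# ASSUMPTION: the third letter comes from dvc, not dvb.
data2$abc <- paste0(substring(data2$dva, 1, 1),
                    substring(data2$dvb, 1, 1),
                    substring(data2$dvc, 1, 1))

# Drop rows with no visit, then keep the last remaining row per id.
data3 <- data2[!is.na(data2$date), ]
data3 <- data3[order(data3$id, data3$visit), ]
last_visit <- do.call(rbind,
  lapply(split(data3, data3$id), function(df) df[nrow(df), ]))
print(last_visit)
```

Because `read.table` reads the ID column as a number, the IDs print as 1 to 5 rather than 001 to 005.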

Jean


On Mon, Dec 17, 2012 at 5:08 PM, arun <smartpink...@yahoo.com> wrote:

> Hi Jean,
>
> Just to clarify whether it is a 'typo' or not.
> data2 <- data.frame(
>   id = rep(data1$ID, 3),
>   visit = rep(1:3, rep(dim(data1)[1], 3)),
>   date = as.Date(c(data1$V1Date, data1$V2date, data1$V3date), "%m/%d/%y"),
>   dva = c(data1$V1a, data1$V2a, data1$V3a),
>   dvb = c(data1$V1a, data1$V2a, data1$V3a),# 'b'
>   dvc = c(data1$V1a, data1$V2a, data1$V3a)) # 'c'
>
>
> A.K.
>
>
>
> ----- Original Message -----
> From: "Adams, Jean" <jvad...@usgs.gov>
> To: marcel curlin <marcelcur...@gmail.com>
> Cc: r-help@r-project.org
> Sent: Monday, December 17, 2012 5:29 PM
> Subject: Re: [R] Manipulation of longitudinal data by row
>
> I had some difficulty getting the data read in using the code you included
> in your email, although I'm not sure why. I'm pasting in the code that
> worked for me, below.
>
> I think that the calculations that you want to make would be easier if you
> rearranged your data first. I used your example data to do just that.
> Once the data are rearranged, it is very easy to look at information on
> the last visit from each ID (see code, below). This includes much of the
> information you describe in your query, 1) date of last completed visit 2)
> whether an ID resolved, and 3) what the final pattern was.
>
> Jean
>
> tC <- textConnection("ID V1Date V1a V1b V1c V2date V2a V2b V2c V3date V3a V3b V3c
> 001 4/5/12 Yes Yes No 6/18/12 Yes No Yes NA NA NA NA
> 002 1/22/12 No No Yes 7/5/12 Yes No Yes NA NA NA NA
> 003 4/5/12 Yes No No 9/4/12 Yes No Yes 11/1/12 Yes No Yes
> 004 8/18/12 Yes Yes Yes 9/22/12 Yes No Yes NA NA NA NA
> 005 9/6/12 Yes No No NA NA NA NA 12/4/12 Yes No Yes")
> data1 <- read.table(header=TRUE, tC)
> close.connection(tC)
> rm(tC)
>
> # rearrange the data
> data2 <- data.frame(
>   id = rep(data1$ID, 3),
>   visit = rep(1:3, rep(dim(data1)[1], 3)),
>   date = as.Date(c(data1$V1Date, data1$V2date, data1$V3date), "%m/%d/%y"),
>   dva = c(data1$V1a, data1$V2a, data1$V3a),
>   dvb = c(data1$V1a, data1$V2a, data1$V3a),
>   dvc = c(data1$V1a, data1$V2a, data1$V3a))
> # define a new variable that is a combination of the three dichotomous variables
> data2$abc <- paste0(substring(data2$dva, 1, 1), substring(data2$dvb, 1, 1),
>   substring(data2$dvb, 1, 1))
> # define a new variable that indicates whether the combination is "normal"
> data2$normal <- data2$abc %in% c("YYN", "YNY", "YYN", "NNY")
>
> # eliminate rows without visit information
> data3 <- data2[!is.na(data2$date), ]
> # split the data into lists according to id
> list4 <- split(data3, data3$id)
>
> # show the last visit from each id
> do.call(rbind, lapply(list4, function(df) df[dim(df)[1], ]))
>
>
>
> On Fri, Dec 14, 2012 at 10:37 AM, marcel curlin <marcelcur...@gmail.com> wrote:
>
> > I have a dataset of the form below, consisting of one unique ID per
> > row, followed by a series of visit dates. At each visit there are
> > values for 3 dichotomous variables. Of the 8 different possible
> > combinations of the three variables, 4 are "abnormal" and the
> > remaining 4 are "normal".
> > Everyone starts out abnormal, and then
> > either continues to be abnormal at subsequent visits, or resolves to a
> > normal pattern at a later visit (I ignore reversion back to abnormal -
> > once they are normal, they are normal).
> >
> > I have to end up with 4 new columns indicating 1) date of last
> > completed visit (regardless of intervening "NAs"), 2) whether an ID
> > resolved or stayed abnormal, 3) if resolved, what the resolution
> > pattern was and 4) what the date of resolution was. NAs always come in
> > groups of 4 (i.e. no visit date, and no value for the 3 variables) and
> > are ignored.
> >
> > Eventually I have to determine mean time to resolution, mean follow-up
> > time, etc. and I think I can do that, but the first part is a bit
> > beyond my coding skill. Suggestions appreciated.
> >
> > tC <- textConnection("
> > ID V1Date V1a V1b V1c V2date V2a V2b V2c V3date V3a V3b V3c
> > 001 4/5/12 Yes Yes No 6/18/12 Yes No Yes NA NA NA NA
> > 002 1/22/12 No No Yes 7/5/12 Yes No Yes NA NA NA NA
> > 003 4/5/12 Yes No No 9/4/12 Yes No Yes 11/1/12 Yes No Yes
> > 004 8/18/12 Yes Yes Yes 9/22/12 Yes No Yes NA NA NA NA
> > 005 9/6/12 Yes No No NA NA NA NA 12/4/12 Yes No Yes
> > ")
> > data1 <- read.table(header=TRUE, tC)
> > close.connection(tC)
> > rm(tC)
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
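
The four columns asked for in the original question (date of last completed visit, whether the ID resolved, the resolution pattern, and the date of resolution) could be derived from the long-format data roughly as in the sketch below. This is only an illustration under stated assumptions. The thread never says which four patterns count as normal, so `normal_patterns` is a placeholder, chosen here so that every first visit in the example data is abnormal, as the question describes. The names `long`, `summarise_id` and `result` are likewise hypothetical.

```r
# Example data from the thread.
# ASSUMPTION: text = and stringsAsFactors = FALSE are choices made here.
data1 <- read.table(header = TRUE, stringsAsFactors = FALSE,
  text = "ID V1Date V1a V1b V1c V2date V2a V2b V2c V3date V3a V3b V3c
001 4/5/12 Yes Yes No 6/18/12 Yes No Yes NA NA NA NA
002 1/22/12 No No Yes 7/5/12 Yes No Yes NA NA NA NA
003 4/5/12 Yes No No 9/4/12 Yes No Yes 11/1/12 Yes No Yes
004 8/18/12 Yes Yes Yes 9/22/12 Yes No Yes NA NA NA NA
005 9/6/12 Yes No No NA NA NA NA 12/4/12 Yes No Yes")

# One row per id and visit, with the three answers collapsed to e.g. "YNY".
long <- data.frame(
  id = rep(data1$ID, 3),
  visit = rep(1:3, each = nrow(data1)),
  date = as.Date(c(data1$V1Date, data1$V2date, data1$V3date),
                 format = "%m/%d/%y"),
  abc = paste0(substring(c(data1$V1a, data1$V2a, data1$V3a), 1, 1),
               substring(c(data1$V1b, data1$V2b, data1$V3b), 1, 1),
               substring(c(data1$V1c, data1$V2c, data1$V3c), 1, 1)),
  stringsAsFactors = FALSE)

# Missed visits (all four values NA) are ignored.
long <- long[!is.na(long$date), ]
long <- long[order(long$id, long$visit), ]

# ASSUMPTION: placeholder set of "normal" patterns, not given in the thread.
normal_patterns <- c("YNY", "NYY", "NYN", "NNN")

# Summarise one id: the first normal visit counts as the resolution,
# and any later reversion to abnormal is ignored.
summarise_id <- function(df) {
  hit <- which(df$abc %in% normal_patterns)
  first <- if (length(hit) > 0) hit[1] else NA_integer_
  data.frame(
    id = df$id[1],
    last_visit = df$date[nrow(df)],
    resolved = !is.na(first),
    resolution_pattern = df$abc[first],  # NA when never resolved
    resolution_date = df$date[first],    # NA when never resolved
    stringsAsFactors = FALSE)
}

result <- do.call(rbind, lapply(split(long, long$id), summarise_id))
print(result)
```

With one row per ID in `result`, differences such as `result$resolution_date - result$last_visit` or the time from first visit to resolution can be taken directly on the Date columns.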