I think we need a task view on longitudinal data manipulation. There are so many approaches to this - people need help navigating them.
I tend to stay away from the lapply-split methods, as they don't look quite as clean and may take longer to run. The aggregate function uses too much data frame subscripting. The plyr package and the mApply function in the Hmisc package provide some other nice solutions. Often I like to stick with tapply, using constructs like

with(mydata, tapply(1:nrow(mydata), subjectID, function(i) {... operate on variables in mydata subscripted by [i] ...}))

Frank

arun kirshna wrote
> Hi,
>
> I am not sure why you are getting different results. I couldn't reproduce
> your problem.
> dat1<- read.table(text="
> ID COMPL SEX HEREDITY
> 1 0 1 2
> 1 0 1 2
> 1 3 1 2
> 2 0 0 1
> 2 1 0 1
> 2 2 0 1
> 2 2 0 1
> 3 0 0 1
> 3 0 0 1
> 3 0 0 1
> 3 0 0 1
> 3 2 0 1
> 4 0 1 2
> 4 0 1 2
> ",sep="",header=TRUE)
> do.call(rbind,lapply(split(dat1,dat1$ID),function(x) if(any(x$COMPL!=0))
> head(x[x$COMPL!=0,],1) else head(x,1)))
> #  ID COMPL SEX HEREDITY
> #1  1     3   1        2
> #2  2     1   0        1
> #3  3     2   0        1
> #4  4     0   1        2
>
> You could also try:
> dat1[with(dat1,ave(COMPL,ID,FUN=function(x) if(any(x!=0)) cumsum(x>0) else
> seq_along(x)))==1,] #modification of David's code
> #   ID COMPL SEX HEREDITY
> #3   1     3   1        2
> #5   2     1   0        1
> #12  3     2   0        1
> #13  4     0   1        2
> A.K.
>
> ________________________________
> From: Tasnuva Tabassum <t.tasnuva@>
> To: arun <smartpink111@>
> Sent: Sunday, February 24, 2013 12:08 AM
> Subject: Re: [R] Selecting First Incidence from Longitudinal Data
>
> sorry, I tried this. But it gave me answer:
>
> #   ID COMPL SEX HEREDITY
> #1   1     0   1        2
> #4   2     0   0        1
> #8   3     0   0        1
> #13  4     0   1        2
>
> On Sat, Feb 23, 2013 at 8:44 PM, arun <smartpink111@> wrote:
>
> Hi,
>>Try this:
>>#dat1
>> do.call(rbind,lapply(split(dat1,dat1$ID),function(x) if(any(x$COMPL!=0)) head(x[x$COMPL!=0,],1) else head(x,1)))
>>
>>#  ID COMPL SEX HEREDITY
>>#1  1     3   1        2
>>#2  2     1   0        1
>>#3  3     2   0        1
>>#4  4     0   1        2
>>A.K.
>>
>>________________________________
>>From: Tasnuva Tabassum <t.tasnuva@>
>>To: Xiaogang Su <xiaogangsu@>
>>Cc: arun <smartpink111@>; R help <r-help@>; Rui Barradas <ruipbarradas@>
>>Sent: Saturday, February 23, 2013 11:23 PM
>>Subject: Re: [R] Selecting First Incidence from Longitudinal Data
>>
>>Hi
>>Thank you very much, but I forgot to tell that I also want to include the patients for which no complication occurred. That is, for my data I want to include patient no. 4, for which the COMPL value will be 0.
>>
>>In that case, what R function should I write?
>>
>>On Sat, Feb 23, 2013 at 12:23 PM, Xiaogang Su <xiaogangsu@> wrote:
>>
>>My bad. I didn't try it out with the real data. Here you go. HTH, X
>>>
>>>dat <- read.table(text="
>>>ID COMPL SEX HEREDITY
>>>1 0 1 2
>>>1 0 1 2
>>>1 3 1 2
>>>2 0 0 1
>>>2 1 0 1
>>>2 2 0 1
>>>2 2 0 1
>>>3 0 0 1
>>>3 0 0 1
>>>3 0 0 1
>>>3 0 0 1
>>>3 2 0 1
>>>4 0 1 2
>>>4 0 1 2
>>>", header = TRUE)
>>>
>>>dat0 <- dat[dat$COMPL!=0, ]
>>>dat0$sequence <- as.vector(unlist(lapply(aggregate(dat0$ID, by=list(dat0$ID),FUN=length)$x, FUN=function(x){seq(1, x)})))
>>>dat0 <- dat0[dat0$sequence==1, ]
>>>dat0
>>>
>>>On Sat, Feb 23, 2013 at 2:09 PM, arun <smartpink111@> wrote:
>>>
>>>HI,
>>>>Tried your approach:
>>>>
>>>> dat1$sequence <- as.vector(unlist(lapply( aggregate(dat1$ID, by=list(dat1$ID),FUN=length)$x, FUN=function(x){seq(1, x)})))
>>>> dat0 <- dat1[dat1$sequence==1 & dat1$COMPL!= 0, ] #your second solution
>>>> dat0
>>>>#[1] ID COMPL SEX HEREDITY sequence
>>>>#<0 rows> (or 0-length row.names)
>>>>
>>>>dat1[dat1$sequence==1,] #here the OP wanted first incidence where COMPL!=0
>>>>#   ID COMPL SEX HEREDITY sequence
>>>>#1   1     0   1        2        1
>>>>#4   2     0   0        1        1
>>>>#8   3     0   0        1        1
>>>>#13  4     0   1        2        1
>>>>A.K.
>>>>
>>>>----- Original Message -----
>>>>From: Xiaogang Su <xiaogangsu@>
>>>>To: Rui Barradas <ruipbarradas@>
>>>>Cc: r-help@
>>>>Sent: Saturday, February 23, 2013 2:15 PM
>>>>Subject: Re: [R] Selecting First Incidence from Longitudinal Data
>>>>
>>>>Try this:
>>>>dat$sequence <- as.vector(unlist(lapply( aggregate(dat$ID, by=list(x),
>>>>FUN=length)$x, FUN=function(x){seq(1, x))))
>>>>dat0 <- dat[dat$sequence==1, ]
>>>>
>>>>HTH, X
>>>>
>>>>On Sat, Feb 23, 2013 at 1:07 PM, Rui Barradas <ruipbarradas@> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> You can use ?aggregate and ?head to do what you want. Try the
>>>>> following.
>>>>>
>>>>> dat <- read.table(text="
>>>>> ID COMPL SEX HEREDITY
>>>>> 1 0 1 2
>>>>> 1 0 1 2
>>>>> 1 3 1 2
>>>>> 2 0 0 1
>>>>> 2 1 0 1
>>>>> 2 2 0 1
>>>>> 2 2 0 1
>>>>> 3 0 0 1
>>>>> 3 0 0 1
>>>>> 3 0 0 1
>>>>> 3 0 0 1
>>>>> 3 2 0 1
>>>>> 4 0 1 2
>>>>> 4 0 1 2
>>>>> ", header = TRUE)
>>>>>
>>>>> aggregate(. ~ ID, data = subset(dat, COMPL != 0), head, 1)
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rui Barradas
>>>>>
>>>>> On 23-02-2013 14:28, Tasnuva Tabassum wrote:
>>>>>
>>>>> I have a longitudinal competing risk data of the form:
>>>>>>
>>>>>> ID COMPL SEX HEREDITY
>>>>>> 1 0 1 2
>>>>>> 1 0 1 2
>>>>>> 1 3 1 2
>>>>>> 2 0 0 1
>>>>>> 2 1 0 1
>>>>>> 2 2 0 1
>>>>>> 2 2 0 1
>>>>>> 3 0 0 1
>>>>>> 3 0 0 1
>>>>>> 3 0 0 1
>>>>>> 3 0 0 1
>>>>>> 3 2 0 1
>>>>>> 4 0 1 2
>>>>>> 4 0 1 2.
>>>>>>
>>>>>> Where, COMPL= health complication of diabetic patients which has value
>>>>>> labels as 0= no complication, 1=coronary heart disease, 2=retinopathy,
>>>>>> 3= nephropathy.
>>>>>>
>>>>>> I want to select only the first complication that occurred to each
>>>>>> patient.
>>>>>> What R function can I use?
>>>>>>
>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help@ mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>--
>>>>==============================
>>>>Xiaogang Su, Ph.D.
>>>>Associate Professor & Statistician
>>>>School of Nursing, University of Alabama
>>>>Birmingham, AL 35294-1210
>>>>(205) 934-2355 [Office]
>>>>xgsu@
>>>>xiaogangsu@
>>>>https://sites.google.com/site/xgsu00/

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Selecting-First-Incidence-from-Longitudinal-Data-tp4659455p4659530.html
Sent from the R help mailing list archive at Nabble.com.
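
P.S. For anyone who wants to try the tapply construct mentioned at the top of this message, here is a minimal sketch on the example data from the thread. It is only an illustration under stated assumptions: the names keep and first_incidence are made up here, and the selection rule (first row with a nonzero COMPL, otherwise the subject's first row) is an assumption about what the original poster asked for.

```r
# Example data copied from the thread
dat1 <- read.table(text = "
ID COMPL SEX HEREDITY
1 0 1 2
1 0 1 2
1 3 1 2
2 0 0 1
2 1 0 1
2 2 0 1
2 2 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 0 0 1
3 2 0 1
4 0 1 2
4 0 1 2
", header = TRUE)

# tapply passes each subject's row indices (i) to the function;
# variables of dat1 are then subscripted by [i] inside it.
# 'keep' is a hypothetical name for the chosen row index per subject.
keep <- with(dat1, tapply(seq_len(nrow(dat1)), ID, function(i) {
  hit <- i[COMPL[i] != 0]                 # rows where a complication occurred
  if (length(hit) > 0) hit[1] else i[1]   # assumed rule: first incidence, else first row
}))

# One row per subject; as.vector() drops the array attributes tapply adds
first_incidence <- dat1[as.vector(keep), ]
first_incidence
```

This should pick rows 3, 5, 12 and 13, the same rows as the ave() variant quoted in the thread. Working with row indices rather than split data frames means the whole row can be recovered with a single subscript at the end.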