Re: [R] simplify a dataframe

2013-07-17 Thread Arnaud Michel
#4 02/02/1995 12/03/1995 #1 13/03/1995 30/06/1995 #2 01/01/1996 31/01/1996 A.K. - Original Message - From: arun To: Arnaud Michel Cc: R help ; Rui Barradas Sent: Wednesday, July 17, 2013 4:14 PM Subject: Re: [R] simplify a dataframe Hi, You could try: df1[,1:2]<-lapply

Re: [R] simplify a dataframe

2013-07-17 Thread arun
96 A.K. - Original Message - From: arun To: Arnaud Michel Cc: R help ; Rui Barradas Sent: Wednesday, July 17, 2013 4:14 PM Subject: Re: [R] simplify a dataframe Hi, You could try: df1[,1:2]<-lapply(df1[,1:2],as.character)  df2New<- data.frame(Deb=unique(with(df1,ave(Debut,INDX,FU

Re: [R] simplify a dataframe

2013-07-17 Thread arun
Arnaud Michel To: Rui Barradas ; R help ; arun Cc: Sent: Wednesday, July 17, 2013 4:03 PM Subject: Re: [R] simplify a dataframe Thank you for the answer to question (1). Sorry for the imprecision in question (2): Suppose the data frame is: df1 <- data.frame( Debut =c ( "24/01/1995", "

Re: [R] simplify a dataframe

2013-07-17 Thread Arnaud Michel
Thank you for the answer to question (1). Sorry for the imprecision in question (2): Suppose the data frame is: df1 <- data.frame( Debut =c ( "24/01/1995", "01/05/1997" ,"31/12/1997", "02/02/1995" ,"28/02/1995" ,"01/03/1995", "13/03/1995", "01/01/1996", "31/01/1996") , Fin = c ( "30/04/1997", "30/12/

Re: [R] simplify a dataframe

2013-07-17 Thread Rui Barradas
Hello, As for question (1), try the following.
    y2 <- cumsum(c(TRUE, diff(x1) > 0))
    identical(as.integer(y1), y2) # y1 is of class "numeric"
As for question (2), I don't understand it. Hope this helps, Rui Barradas
On 17-07-2013 18:21, Arnaud Michel wrote: Hi Arun I have two questio
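To illustrate the cumsum(c(TRUE, diff(x1) > 0)) idiom from the reply above, here is a minimal sketch. The vector x1 below is made up for the example (the poster's x1 and y1 are not shown in this snippet), and it is assumed to be sorted in non-decreasing order.

```r
# Hypothetical input: a non-decreasing numeric vector (not the poster's data)
x1 <- c(1, 1, 1, 3, 3, 7, 7, 7)

# diff(x1) > 0 is TRUE wherever the value increases from one element to the next.
# The leading TRUE opens the first run, and cumsum() then numbers the runs.
y2 <- cumsum(c(TRUE, diff(x1) > 0))

print(y2)
```

Because cumsum() of a logical vector returns integers, y2 is of class "integer", which is presumably why the reply wraps y1 in as.integer() before calling identical().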

Re: [R] simplify a dataframe

2013-07-14 Thread Arnaud Michel
1997 4.0 13/03/1995 30/06/1995 4.1 01/01/1996 31/01/1996 10 02/02/1995 12/03/1995 6 24/01/1995 31/08/1995 9 01/09/1995 29/02/2000 1 26/01/1995 31/08/2001 2 05/09/2012 31/12/2013 3 01/09/2004 31/08/2007 7 01/09/2001 31/08/2004 8 01/09/2007 04/09/2012 A.K. __

Re: [R] simplify a dataframe

2013-07-14 Thread arun
in,1),stringsAsFactors=FALSE)))}))
    row.names(res)<- 1:nrow(res)
    df2[11,8]<- "31/12/2013"
    names(res)[1]<- "Mat"
    identical(res,df2) #[1] TRUE
A.K. - Original Message - From: arun To: Arnaud Michel Cc: R help Sent: Sunday, July 14, 2013 2:39 PM Subject

Re: [R] simplify a dataframe

2013-07-14 Thread arun
   01/09/1995 29/02/2000 1   26/01/1995 31/08/2001 2   05/09/2012 31/12/2013 3   01/09/2004 31/08/2007 7   01/09/2001 31/08/2004 8   01/09/2007 04/09/2012 A.K. From: Arnaud Michel To: arun Cc: R help ; jholt...@gmail.com; Rui Barradas Sent: Sunday, July 14

Re: [R] simplify a dataframe

2013-07-14 Thread Arnaud Michel
Another remark: If I calculate
    df1$D <- as.numeric(as.Date(paste0(substr(df1$Debut,7,10),"-", substr(df1$Debut,4,5),"-",substr(df1$Debut,1,2))))
and
    df1$F <- as.numeric(as.Date(paste0(substr(df1$Fin,7,10),"-", substr(df1$Fin,4,5),"-",substr(df1$Fin,1,2))))
and if there is no interruption of time f
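The substr()/paste0() conversion above can probably be shortened with the format argument of as.Date(). The sketch below uses made-up dates (not the poster's data) and assumes that "no interruption" means a period starts exactly one day after the previous one ends; that rule is an assumption, since the message is cut off here.

```r
# Hypothetical periods in dd/mm/yyyy format (illustrative data only)
periods <- data.frame(
  Debut = c("24/01/1995", "01/05/1997", "01/01/1999"),
  Fin   = c("30/04/1997", "30/12/1997", "31/12/1999"),
  stringsAsFactors = FALSE
)

# Parse the strings directly instead of rebuilding yyyy-mm-dd with substr()
debut <- as.Date(periods$Debut, format = "%d/%m/%Y")
fin   <- as.Date(periods$Fin,   format = "%d/%m/%Y")

# Days between the end of one row and the start of the next
gap <- as.numeric(debut[-1] - fin[-nrow(periods)])

# Assumed rule: a new group starts whenever the gap is not exactly one day
grp <- cumsum(c(TRUE, gap != 1))

# Keep the first Debut and the last Fin of each group
merged <- data.frame(
  Debut = periods$Debut[!duplicated(grp)],
  Fin   = periods$Fin[!duplicated(grp, fromLast = TRUE)],
  stringsAsFactors = FALSE
)
print(merged)
```

With this data the first two rows are contiguous and collapse into one period, while the third row stays separate. A real solution would also have to group by the other columns (Matricule, contrat, and so on), which this sketch leaves out.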

Re: [R] simplify a dataframe

2013-07-14 Thread Arnaud Michel
Hi, Sorry for the lack of clarity.
On 13/07/2013 17:18, arun wrote:
> Hi,
> "when the value of Debut of lines i = value Fin of lines i-1"
> That part is not clear esp. when it is looked upon with the expected output
> (df2).
I want to group the lines which have the same characteristics (Matric

Re: [R] simplify a dataframe

2013-07-13 Thread arun
Hi, "when the value of Debut of lines i = value Fin of lines i-1" That part is not clear esp. when it is looked upon with the expected output (df2).  Also, in your example dataset: df1$contrat[grep("^CDD",df1$contrat)] #[1] "CDD détaché ext. Cirad" "CDD détaché ext. Cirad" "CDD détaché ext. Cirad

Re: [R] simplify a dataframe

2013-07-13 Thread jim holtman
Here is how you can do it with the 'data.table' package:
> require(data.table)
> df1 <- data.table(df1)
> result <- df1[
+ , list(Debut = Debut[1L] # first entry
+ , Fin = Fin[1L]
+ )
+ , keyby = c("Matricule", "Nom", "Sexe", "DateNaissance", "contrat", "Pays")
+ ]
> result
MatriculeN

Re: [R] simplify a dataframe

2013-07-12 Thread Rui Barradas
Hello, My solution is missing a row, but maybe you can find some inspiration.
    cols <- c("Matricule", "Nom", "Sexe", "DateNaissance", "contrat", "Pays")
    irow1 <- duplicated(df1[, cols])
    irow2 <- c(FALSE, df1$Debut[-1] == df1$Fin[-nrow(df1)])
    df3 <- df1[!irow1 & !irow2, ]
    dim(df2); dim(df3) #