Hi, I am trying to apply the logic below to generate a flag_1 column on a data set of ~1.2 million records in R.
Code:

    nrows <- nrow(A)
    for (i in 1:nrows) {
      # Guard the look-ahead so the last row does not index past the end.
      if (i < nrows && A$customer[i] == A$customer[i + 1]) {
        if (is.na(A$Time_Diff[i])) A$flag_1[i] <- 1
        else if (A$Time_Diff[i] > 12) A$flag_1[i] <- 1
        else A$flag_1[i] <- A$flag_1[i - 1] + 1
      } else {
        # Same rules as above; the two branches are identical as posted.
        if (is.na(A$Time_Diff[i])) A$flag_1[i] <- 1
        else if (A$Time_Diff[i] > 12) A$flag_1[i] <- 1
        else A$flag_1[i] <- A$flag_1[i - 1] + 1
      }
    }

The resulting data set should look like this:

    Customer  Time_diff  flag_1
    1         NA         1
    1         10         2
    1         8          3
    1         15         1
    1         9          2
    1         10         3
    2         NA         1
    2         2          2
    2         5          3

The above logic will take approximately 60 hours to generate the flag_1 column on a data set of ~1.2 million records. Is there a more efficient way to implement this logic in R?

Appreciate your help.

Thanks,
Ravi

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
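For reference, a minimal sketch of one possible vectorized alternative to the loop, not a definitive answer. It assumes the intended rule is: restart the counter at 1 whenever Time_Diff is NA, Time_Diff exceeds 12, or the customer changes from the previous row; otherwise add 1 to the previous flag. The helper names `new_customer`, `reset` and `start` are made up for illustration, and the toy data frame is copied from the example table above.

```r
# Toy data taken from the example table in the post.
A <- data.frame(
  customer  = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  Time_Diff = c(NA, 10, 8, 15, 9, 10, NA, 2, 5)
)

n <- nrow(A)

# TRUE where the customer differs from the previous row (assumed intent);
# the first row always starts a new run.
new_customer <- c(TRUE, A$customer[-1] != A$customer[-n])

# A run restarts on a missing Time_Diff, a gap above 12, or a new customer.
# is.na() is tested first, so NA | TRUE evaluates to TRUE rather than NA.
reset <- is.na(A$Time_Diff) | A$Time_Diff > 12 | new_customer

# Row index at which each run starts.
start <- which(reset)

# cumsum(reset) gives the run number of every row; subtracting the
# start index of that run yields the position within the run.
A$flag_1 <- seq_len(n) - start[cumsum(reset)] + 1L

print(A)
```

Because this works on whole columns instead of assigning into the data frame one row at a time, it should scale much better than the loop, though I have not timed it on the full ~1.2 million records. It also assumes the rows are already sorted by customer and then by time.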