Hi,

I am trying to apply the logic below to generate a flag_1 column on a
data set of ~1.2 million records in R.

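Code:

nrows <- nrow(A)
A$flag_1 <- NA_real_   # create the column before filling it row by row

for (i in seq_len(nrows)) {
  # The counter restarts at 1 when Time_Diff is missing or greater than 12;
  # otherwise it continues from the previous row.
  # The first row of each customer has Time_Diff NA (see the table below),
  # so flag_1[i - 1] is never read at i = 1 or across a customer boundary.
  if (is.na(A$Time_Diff[i]) || A$Time_Diff[i] > 12) {
    A$flag_1[i] <- 1
  } else {
    A$flag_1[i] <- A$flag_1[i - 1] + 1
  }
}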


Resultant dataset should look like:

customer   Time_Diff   flag_1
1          NA          1
1          10          2
1          8           3
1          15          1
1          9           2
1          10          3
2          NA          1
2          2           2
2          5           3

The above logic will take approximately 60 hours to generate the flag_1
column on a dataset consisting of ~1.2 million records. Is there a more
efficient way to implement this logic in R?
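
For reference, the rule can probably be written without an explicit loop.
The following is only a minimal sketch in base R, not a tested solution for
the full data: it assumes the rows are already sorted by customer and time,
and that the first row of every customer has Time_Diff NA, as in the table
above. The names starts_run and run_id are made up for illustration.

```r
# Toy data copied from the expected-output table in this post.
A <- data.frame(
  customer  = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  Time_Diff = c(NA, 10, 8, 15, 9, 10, NA, 2, 5)
)

# A row starts a new run when Time_Diff is missing or greater than 12.
# Assumption: the very first row is also forced to start a run.
starts_run <- is.na(A$Time_Diff) | A$Time_Diff > 12
starts_run[1] <- TRUE

# cumsum() gives each run its own id; ave() then numbers the rows
# 1, 2, 3, ... within each run.
run_id <- cumsum(starts_run)
A$flag_1 <- ave(seq_along(run_id), run_id, FUN = seq_along)

print(A)
```

On the toy data this gives the flag_1 column shown in the table. If a
customer's first row could have a non-missing Time_Diff, the customer
change would have to be added to starts_run as well.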

Appreciate your help.

Thanks,
Ravi

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.