Hi fellas, I am working on a data frame, cam, with about 20K rows and 14 columns. The task compares two values, t1 and t2, taken from adjacent rows.
###
cap = cam;  # this doesn't take long. ~1 sec.
for (i in 1:length(cam$end_date))
{
    x1 = strptime(cam$end_date[i], "%d/%m/%Y");
    x2 = strptime(cam$end_date[i+1], "%d/%m/%Y");
    t1 = cam$vol[i];
    t2 = cam$vol[i+1];
    if (!is.na(x2) && !is.na(x1) && !is.na(t1) && !is.na(t2))
    {
        if ((x2 >= x1) && (t1 == t2))  # date and vol
        {
            cap$levels[i] = 1;    # make change to specific data frame cell
            cap$levels[i+1] = 1;
        }
    }
}
###

Having coded that, I ran a timing profile on this section, and every 1,000 row comparisons take ~1.1 minutes on a 2.8 GHz dual-core box (a test box we use). That works out to roughly 21 minutes for 20K rows, which is definitely not where we want it headed.

I believe the optimisation (or even a different way of indexing into the data frame) lies in the innermost `if', specifically in `cap$levels[i]=1;', but I am at a loss, having scoured the documentation without finding anything of value.

So my question is: are there any general or specific changes I can make to speed up the code dramatically?

Thanks folks.

--
Regards,
Ishwor Gurung
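
For comparison, here is a minimal sketch of how the same adjacent-row test could be written without the loop. It is an illustration under assumptions, not a measured fix. The sample data and the helper name mark_levels are hypothetical, the levels column is assumed to exist already, and as.Date is swapped in for strptime on the assumption that end_date holds dates only, with no time of day. The idea is to parse all dates once and compare each row with the next using shifted index vectors, so the data frame is written to in one step rather than cell by cell.

```r
# Hypothetical sample data standing in for the real `cam` data frame.
cam <- data.frame(
  end_date = c("01/01/2009", "02/01/2009", "01/01/2009", "05/01/2009",
               NA, "06/01/2009", "06/01/2009"),
  vol      = c(10, 10, 10, 7, 7, 7, 7),
  levels   = 0,   # assumption: the `levels` column already exists
  stringsAsFactors = FALSE
)

# Hypothetical helper: flags rows i and i+1 when the later date is not
# earlier than the former and both rows have the same vol.
mark_levels <- function(df) {
  out <- df
  n <- nrow(df)
  if (n < 2) return(out)

  # Parse every date once, instead of twice per loop iteration.
  d <- as.Date(as.character(df$end_date), format = "%d/%m/%Y")
  v <- df$vol

  i <- seq_len(n - 1)   # index of the first row in each adjacent pair
  ok <- !is.na(d[i]) & !is.na(d[i + 1]) & !is.na(v[i]) & !is.na(v[i + 1])
  hit <- which(ok & d[i + 1] >= d[i] & v[i + 1] == v[i])

  # One assignment covers both rows of every matching pair.
  out$levels[c(hit, hit + 1)] <- 1
  out
}

cap <- mark_levels(cam)
print(cap)
```

On the sample data, only the pairs (1, 2) and (6, 7) satisfy both conditions, so those four rows are the ones that get flagged. Whether this reproduces the loop exactly on the real data depends on how end_date and levels are stored there, so it would be worth checking against the loop's output on a small slice first.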