Hi Andrew,

I am aware that this is an R mailing list, but for such tasks (I deal a lot with huge genomic datasets) I tend to use awk and sed to preprocess the data whenever I run into performance problems. Otherwise, for string handling in R, I recommend the stringr package, but I don't know about its performance...
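As a rough illustration of the awk preprocessing mentioned above, here is a minimal sketch. It assumes the addresses have been exported to a plain text file with one IPv4 address per line; the file names `ips.txt` and `ips_split.tsv` and the three sample addresses are hypothetical stand-ins, not taken from the original data.

```shell
# Hypothetical sample input: one IPv4 address per line
# (a stand-in for the exported ComputerName column).
printf '%s\n' 192.168.1.10 10.0.0.5 172.16.254.3 > ips.txt

# Split each address on "." and write the address plus its four
# components as tab-separated columns, preceded by a header row.
# Lines that do not have exactly four dot-separated fields are skipped.
awk -F'[.]' '
  BEGIN   { OFS = "\t"; print "ComputerName", "IPA", "IPB", "IPC", "IPD" }
  NF == 4 { print $0, $1, $2, $3, $4 }
' ips.txt > ips_split.tsv
```

The resulting file could then be read back into R, for example with `read.delim("ips_split.tsv")`. awk processes the file in a single streaming pass, so the split happens once per line without repeatedly modifying a data frame.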
Felix

> Folks,
>
> I have a data frame with 4861469 rows that contains an ip address
> xxx.xxx.xxx.xxx as one of the columns. I want to assign a site to each
> row based on IP ranges. To do this I have a function to split the ip
> address as character into class A,B,C and D components. It works but is
> horribly inefficient in terms of speed. I can't quite see how one of the
> l/s/m/t/apply functions could be brought to bear on the problem. Does
> anyone have any thoughts?
>
> for(i in 1:4861469)
> {
> lst <-unlist(strsplit(data$ComputerName[i], "\\."))
> data$IPA[i] <-lst[[1]]
> data$IPB[i] <-lst[[2]]
> data$IPC[i] <-lst[[3]]
> data$IPD[i] <-lst[[4]]
> rm(lst)
> }
>
> Andrew
>
> Andrew Roberts
> Children's Orthopaedic Surgeon
> RJAH, Oswestry, UK

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.