Re: [R] splitting strings efficiently

2012-01-09 Thread MacQueen, Don
See suggestion inserted below. It assumes and requires that every input IP address has the required four elements.
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
On 1/8/12 5:11 AM, "Enrico Schumann" wrote:
> >Hi Andrew,
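The four-element assumption mentioned in this reply can be checked before any reshaping is attempted. The following is only a sketch of such a check, not the code from the message itself (which is not shown here); the vector `ip` and the names `n_parts`, `ok` and `bad_rows` are made up for illustration.

```r
# Hypothetical example data; the second entry is malformed on purpose.
ip <- c("10.0.0.1", "192.168.1", "172.16.254.3")

# Count the dot-separated components of every address in one vectorized call.
n_parts <- lengths(strsplit(ip, ".", fixed = TRUE))

# TRUE where an address has exactly four components.
ok <- n_parts == 4L

# Row indices that would break a four-column reshape.
bad_rows <- which(!ok)
```

Rows listed in `bad_rows` could then be dropped or repaired before the split results are turned into a four-column structure.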

Re: [R] splitting strings efficiently

2012-01-08 Thread drflxms
Hi Andrew, I am aware that this is an R mailing list, but for such tasks (I deal a lot with huge genomic datasets) I tend to use awk and sed to preprocess the data if I run into performance problems. Otherwise, for handling strings in R I recommend the stringr package, but I don't know abo

Re: [R] splitting strings efficiently

2012-01-08 Thread Martin Morgan
On 01/08/2012 11:37 AM, jim holtman wrote:
Just a quick followup to the previous post using 4M entries: (20 seconds would seem like a reasonable time for the operation)
ip <- "123.456.789.321" ## example data
df <- data.frame(ip = rep(ip, 4e6), stringsAsFactors=FALSE)
system.time(x<- strs

Re: [R] splitting strings efficiently

2012-01-08 Thread jim holtman
Just a quick followup to the previous post using 4M entries: (20 seconds would seem like a reasonable time for the operation)
> ip <- "123.456.789.321" ## example data
> df <- data.frame(ip = rep(ip, 4e6), stringsAsFactors=FALSE)
> system.time(x <- strsplit(df$ip, '\\.'))
user system elap

Re: [R] splitting strings efficiently

2012-01-08 Thread Enrico Schumann
Hi Andrew, you can use strsplit for a character vector; you do not have to call it for every element data$ComputerName[i]. If I understand correctly, maybe something like this helps
> ip <- "123.456.789.321" ## example data
> df <- data.frame(ip = rep(ip, 9), stringsAsFactors=FALSE)
> df
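As a rough illustration of the vectorized approach described in this reply, the sketch below splits a whole column in a single strsplit call and reshapes the result into four integer columns. It is not the code from the message, which is cut off above; the example addresses and the column names A to D are assumptions, and it presumes every address has exactly four components.

```r
# Hypothetical example data; the thread's real data frame is not shown.
df <- data.frame(ip = c("10.0.0.1", "192.168.1.20", "172.16.254.3"),
                 stringsAsFactors = FALSE)

# One call splits every address; fixed = TRUE treats "." literally (no regex).
parts <- strsplit(df$ip, ".", fixed = TRUE)

# Assumes exactly four components per address, so the flattened vector
# fills a four-column matrix row by row.
octets <- matrix(as.integer(unlist(parts)), ncol = 4, byrow = TRUE,
                 dimnames = list(NULL, c("A", "B", "C", "D")))

# Attach the components as ordinary integer columns.
df <- cbind(df, octets)
```

With the components stored as integers, range comparisons such as `df$A == 10L & df$B < 16L` can be evaluated on the whole column at once instead of row by row.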

[R] splitting strings efficiently

2012-01-08 Thread Andrew Roberts
Folks, I have a data frame with 4861469 rows that contains an IP address xxx.xxx.xxx.xxx as one of the columns. I want to assign a site to each row based on IP ranges. To do this I have a function that splits the IP address, as a character string, into class A, B, C and D components. It works but is horribly