On 2011-05-29 23:08, Matthew Keller wrote:
God this listserve is awesome. Thanks to everyone for their ideas.
I'll speed & memory test tomorrow and change the code. Thanks again!
Since you're dealing with a vector of ~ 1e8 elements, you might
find that (at a probably small cost of time) you can reduce the
memory requirements by processing the vector in pieces:
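## adjust n to suit trade-off between memory usage and time
n <- 100
## chunk size, rounded up so a length that is not a multiple of n is covered
k <- ceiling(length(x) / n)
L <- vector("list", n)
for (i in seq_len(n)) {
  lo <- (i - 1) * k + 1
  if (lo > length(x)) break  # fewer than n chunks were needed
  y <- x[lo:min(i * k, length(x))]
  L[[i]] <- gsub("^(.*?)\\..*$", "\\1", y, perl = TRUE)
}
newx <- unlist(L)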
Peter Ehlers
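A variation on the chunked loop above, sketched here rather than taken from the thread: instead of collecting the pieces in a list and calling unlist() at the end, the result vector can be allocated once and filled block by block. The function name strip_after_dot and the chunk_size argument are made up for illustration. sub() is used in place of gsub() on the assumption that the anchored pattern can match at most once per string, so the two should give the same result.

```r
# Hypothetical helper (not from the thread): drop everything from the first "."
# onward, working through x in blocks of at most chunk_size elements.
strip_after_dot <- function(x, chunk_size = 1e6) {
  n <- length(x)
  out <- character(n)                 # result allocated once, filled in place
  if (n == 0L) return(out)
  for (start in seq(1, n, by = chunk_size)) {
    idx <- start:min(start + chunk_size - 1, n)   # last block may be shorter
    out[idx] <- sub("^(.*?)\\..*$", "\\1", x[idx], perl = TRUE)
  }
  out
}

x <- c("18x.6", "12x.9", "302x.3")
res <- strip_after_dot(rep(x, 5), chunk_size = 4)  # 15 elements, 4 blocks
print(res)
```

Whether this lowers peak memory in practice would need to be checked on the real 132e6-element vector; chunk_size plays the same tuning role as n in the loop above.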
Matt
On Sun, May 29, 2011 at 6:44 PM, Ian Gow <iand...@gmail.com> wrote:
Not a new approach, but some benchmark data (the perl=TRUE speeds up Jim's
suggestion):
x <- c('18x.6','12x.9','302x.3')
y <- rep(x,100000)
system.time(temp <- unlist(lapply(strsplit(y,".",fixed=TRUE),function(x) x[1])))
user system elapsed
1.203 0.018 1.222
system.time(temp2 <- gsub("^(.*?)\\..*$","\\1",y, perl=TRUE))
user system elapsed
0.176 0.001 0.176
identical(temp2, temp)
[1] TRUE
system.time(temp3 <- gsub("^(.*)\\..*", '\\1', y))
user system elapsed
0.292 0.001 0.291
identical(temp3, temp)
[1] TRUE
system.time(temp3 <- gsub("^(.*)\\..*", '\\1', y, perl=TRUE))
user system elapsed
0.160 0.001 0.161
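One further option that does not appear in the thread: since the delimiter is a literal ".", the regular expression can be replaced by a fixed-string search with regexpr() followed by substr(). The sketch below only shows that the approach gives the same strings; no timing is claimed, and the "nodot" element is invented here to cover the case where no "." is present.

```r
# Alternative sketch (not from the thread): locate the first literal "."
# and keep the characters before it.
x <- c("18x.6", "12x.9", "302x.3", "nodot")

# regexpr() returns the position of the first match, or -1 if there is none;
# as.integer() drops the extra attributes regexpr() attaches.
pos <- as.integer(regexpr(".", x, fixed = TRUE))

res <- x                      # strings without a "." are left unchanged
has_dot <- pos > 0
res[has_dot] <- substr(x[has_dot], 1, pos[has_dot] - 1)

print(res)
# [1] "18x"   "12x"   "302x"  "nodot"
```

It may be worth running this through system.time() alongside the gsub() variants before choosing between them.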
On 5/29/11 7:40 PM, "jim holtman" <jholt...@gmail.com> wrote:
Try this approach:
x <- c('18x.6','12x.9','302x.3')
gsub("^(.*)\\..*", '\\1', x)
[1] "18x" "12x" "302x"
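A note on the two patterns used in this thread: the greedy "^(.*)\\..*" keeps everything up to the last ".", while the lazy "^(.*?)\\..*$" (with perl=TRUE) and the strsplit() version keep everything up to the first ".". For the example data, with exactly one "." per string, they agree. The string "a.b.c" below is made up to show where they differ.

```r
# "a.b.c" is an invented example with more than one "."
x <- c("18x.6", "a.b.c")

greedy <- gsub("^(.*)\\..*", "\\1", x)                  # cuts at the LAST "."
lazy   <- gsub("^(.*?)\\..*$", "\\1", x, perl = TRUE)   # cuts at the FIRST "."
split_first <- unlist(lapply(strsplit(x, ".", fixed = TRUE),
                             function(s) s[1]))         # also the FIRST "."

print(greedy)       # [1] "18x" "a.b"
print(lazy)         # [1] "18x" "a"
print(split_first)  # [1] "18x" "a"
```

If the real data can contain more than one "." per element, the lazy pattern is the one that reproduces the original strsplit() result.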
On Sun, May 29, 2011 at 8:10 PM, Matthew Keller <mckellerc...@gmail.com> wrote:
hi all,
I'm full of questions today :). Thanks in advance for your help!
Here's the problem:
x <- c('18x.6','12x.9','302x.3')
I want to get a vector that is c('18x','12x','302x')
This is easily done using this code:
unlist(lapply(strsplit(x,".",fixed=TRUE),function(x) x[1]))
So far so good. The problem is that x is a vector of length 132e6.
When I run the above code, it runs for > 30 minutes, and it takes > 23
GB RAM (no kidding!).
Does anyone have ideas about how to speed up the code above and (more
importantly) reduce the RAM footprint? I'd prefer not to change the
file on disk using, e.g., awk, but I will do that as a last resort.
Best
Matt
--
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?