Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Answer to my own question:

ush <- data.table(read.csv(...))
setkey(ush, product_id)
s1 <- ush[J(product.id)]

> user system elapsed
> 0.000 0.000 0.003

It seems like that's the method to use! Amazing.

-- View this message in context: http://r.789695.n4.nabble.com/Data-Frame-Searc
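For readers following along, here is a minimal sketch of the keyed lookup described above. The data is synthetic and the column names other than product_id are assumptions, since the original CSV is not shown; it assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical stand-in for the poster's file; sizes and the
# user_id column are assumptions for illustration only.
set.seed(1)
ush <- data.table(
  product_id = sample(1:1000, 1e5, replace = TRUE),
  user_id    = seq_len(1e5)
)

# Sort the table by product_id and mark that column as the key.
setkey(ush, product_id)

product.id <- 42L

# Keyed lookup: J() builds the lookup value and data.table
# binary-searches the sorted key instead of scanning every row.
s1 <- ush[J(product.id)]

# Vector-scan version for comparison; it should select the same rows.
s2 <- subset(ush, product_id == product.id)
```

Note that the lookup value should have the same type as the key column (integer here); a mismatch may lead to coercion or an error, depending on the data.table version.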

Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Update from email outside of this thread:

Justin Haynes writes:
> matrices will help, but two quick solutions:
>
> if you are looking for single items to go in the some_value space, use ==
> instead of %in% and you'll notice speedups. The second more involved
> option is to take a look at the
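The first suggestion can be tried with a small base R sketch like the one below. The data frame, its column names, and the value 42 are assumptions made for illustration; actual timings will depend on the machine and the data.

```r
# Hypothetical data; column names and sizes are assumptions.
set.seed(1)
df <- data.frame(
  product_id = sample(1:1000, 1e6, replace = TRUE),
  value      = runif(1e6)
)
some_value <- 42L

# Single value: == is a plain element-wise comparison.
t_eq <- system.time(r_eq <- df[df$product_id == some_value, ])

# %in% is implemented via match(), which does extra work that only
# pays off when several values are looked up at once.
t_in <- system.time(r_in <- df[df$product_id %in% some_value, ])

# Show user, system and elapsed times side by side.
print(rbind(equals = t_eq, in_op = t_in)[, 1:3])
```

One difference to keep in mind: if the column contains NA, == produces NA in the logical index (and hence NA rows in the result), while %in% returns FALSE for those entries.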

Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
So, here is the resulting time from using the data.table package:

> user system elapsed
> 0.800 0.012 1.847

Here are the methods that I am using:

ush <- data.table(read.csv(...))
setkey(ush, product_id)
s1 <- subset(ush, product_id == product.id)

Seems like a minor improvement but not

Re: [R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Wow, these specs are fantastic:

> user system elapsed
> 0.33 0.00 0.39

I wonder how much of that is because of the capacity of the box that you are running R on. Can you post pertinent specs? This suggests to me that hardware upgrades (RAM specifically) may also be in order. Inv

Re: [R] Data Frame Search Slow

2011-11-22 Thread jim holtman
Take a look at using the 'data.table' package. Here are some times to do the lookup using data frames, matrices and data.tables; data.tables give the answer in less than 0.1 seconds.

> str(x.df)
'data.frame': 250 obs. of 4 variables:
 $ x : Factor w/ 455063 levels "","AAAB",..: 200683
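A comparison of this kind can be set up along the following lines. This is only a sketch in base R, not the code from the message above: the data is synthetic and far smaller than the thread's, the object names x.df and x.m and the column a are assumptions, and the data.table case is left out to keep the example free of dependencies.

```r
# Hypothetical data; sizes, key format and the column a are assumptions.
set.seed(1)
n    <- 2e5
keys <- sprintf("K%05d", sample(1:5000, n, replace = TRUE))

x.df <- data.frame(x = keys, a = runif(n), stringsAsFactors = FALSE)

# A matrix holds a single type, so the numeric column becomes character.
x.m <- cbind(x = keys, a = as.character(x.df$a))

key <- keys[1]  # a key that is guaranteed to be present

# Same lookup against the data frame and against the matrix.
t.df <- system.time(r.df <- x.df[x.df$x == key, ])
t.m  <- system.time(r.m  <- x.m[x.m[, "x"] == key, , drop = FALSE])

# Show user, system and elapsed times side by side.
print(rbind(data.frame = t.df, matrix = t.m)[, 1:3])
```

For a single lookup at this size the timings are small and noisy; repeating each lookup in a loop gives a steadier comparison.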

[R] Data Frame Search Slow

2011-11-22 Thread TimothyDalbey
Hey All,

So - I promise to write a blog post on this topic and post it somewhere on the internet once I get to the bottom of this. Basically, the set-up to the problem is like this:

1. I have a data frame with dim (2547290, 4)
2. I need to make SQL-like lookups on the dataframe. I have been u