Re: [R] How long does skipping in read.table take

2010-10-23 Thread Gabor Grothendieck
On Sat, Oct 23, 2010 at 10:52 AM, Dimitri Liakhovitski wrote: > Just tried it on my work computer (Windows XP, I only have 2 GB RAM): > I've run your code, just indicated the separator "|" in read.table (in > DF line) and added the actual processing (writing out of the result > with a file name) -

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
I am also running the same code on my powerful home PC. It's been running for 25 minutes already, and still has not printed the first end time (does it mean it's still trying to read in DF for the first time)? On Sat, Oct 23, 2010 at 10:52 AM, Dimitri Liakhovitski wrote: > Just tried it on my work

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
Just tried it on my work computer (Windows XP, I only have 2 GB RAM): I've run your code, just indicated the separator "|" in read.table (in DF line) and added the actual processing (writing out of the result with a file name) - see below. I got: Error in textConnection(x) : cannot allocate memory

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
Gabor, thanks a lot. So, I don't really need sql? That's great. I'll try your code. To finish with sql, I've run this: (I wanted to skip the first 11 million rows) mydata<-read.csv.sql("my.file.txt", sep="|", eol="\r\n", sql = "select * from file limit 100, 1099") After 20 min (on a 4-co

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Gabor Grothendieck
On Sat, Oct 23, 2010 at 10:07 AM, Dimitri Liakhovitski wrote: > I just tried it: > > for(i in 11:16){ #i<-11 >  start<-Sys.time() >  print(start) >  flush.console() >  filename<-paste("skipped millions- ",i,".txt",sep="") >  mydata<-read.csv.sql("myfilel.txt", sep="|", eol="\r\n", sql = > "select

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
Oh, wait a sec - does it mean I can't feed my objects into sql commands? On Sat, Oct 23, 2010 at 10:07 AM, Dimitri Liakhovitski wrote: > I just tried it: > > for(i in 11:16){ #i<-11 >  start<-Sys.time() >  print(start) >  flush.console() >  filename<-paste("skipped millions- ",i,".txt",sep="") >  
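On the question of feeding R objects into the SQL: the statement is an ordinary character string, so an `i` written inside the quotes is passed through as literal text and is never replaced by the loop variable's value. One way around that, sketched here under assumptions, is to build the string in R before handing it over. The chunk size of 100 and the `(i - 1)` offset arithmetic are illustrative choices, not taken from the thread; only the string construction is shown, and the result would then be supplied as the `sql =` argument.

```r
# Build one query string per loop iteration, substituting R values into the SQL.
chunk_size <- 100L            # assumed chunk size, for illustration only
queries <- character(0)

for (i in 11:16) {
  # Rows to skip before this chunk starts (assumed layout: chunk i follows i - 1 full chunks).
  offset <- as.integer(chunk_size * (i - 1))
  sql <- sprintf("select * from file limit %d offset %d", chunk_size, offset)
  queries <- c(queries, sql)
}

print(queries)
```

In SQLite the two-number form `limit a, b` means "skip a rows, return b rows", so the explicit `limit ... offset ...` spelling used above may be easier to read back.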

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
I just tried it: for(i in 11:16){ #i<-11 start<-Sys.time() print(start) flush.console() filename<-paste("skipped millions- ",i,".txt",sep="") mydata<-read.csv.sql("myfilel.txt", sep="|", eol="\r\n", sql = "select * from file limit 100, (100*i-1)") write.table(mydata,file=filename,sep

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
Oh, I understand - I did not realize it's reading in the whole file. So, is there any way to make it read it in only once and then spit into R just one piece (e.g., 1 million rows), write a regular file out (e.g., a txt using write.table), and then grab the next million? Because I was planning to do
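A minimal sketch of the "read once, hand over one piece at a time" idea, using only base R rather than the code posted in the thread: an open file connection remembers its position, so each `readLines` call continues where the previous one stopped and nothing is re-skipped. The toy file, the chunk size of 10, and the output names `chunk-1.txt`, `chunk-2.txt`, ... are all assumptions for illustration; for the real file the chunk size would be around one million.

```r
# Toy pipe-delimited file standing in for the real table (assumption).
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(id = 1:25, label = paste0("row", 1:25))
write.table(toy, tmp, sep = "|", quote = FALSE, row.names = FALSE)

chunk_size <- 10              # would be e.g. 1e6 for the real file
out_files <- character(0)

con <- file(tmp, open = "r")  # open once; the read position persists between calls
header <- strsplit(readLines(con, n = 1), "|", fixed = TRUE)[[1]]

i <- 0
repeat {
  lines <- readLines(con, n = chunk_size)   # next piece, starting where the last one ended
  if (length(lines) == 0) break             # end of file reached
  i <- i + 1
  # Parse this piece only; the rest of the file stays on disk.
  chunk <- read.table(text = lines, sep = "|", col.names = header,
                      stringsAsFactors = FALSE)
  out <- file.path(tempdir(), paste0("chunk-", i, ".txt"))
  write.table(chunk, out, sep = "|", quote = FALSE, row.names = FALSE)
  out_files <- c(out_files, out)
}
close(con)

print(basename(out_files))
```

If the pieces only need to be split and not processed, writing `lines` straight back out with `writeLines` would avoid parsing them at all, which may matter when memory is tight.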

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Gabor Grothendieck
On Sat, Oct 23, 2010 at 9:20 AM, Dimitri Liakhovitski wrote: > This is very helpful, Gabor. > I've run the code to figure out the end of the line and here is what I > am seeing at the end of each line: \r\n > So, I specified like this: > mydata<-read.csv.sql("myfile.txt", sep="|", eol="\r\n", sql

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
This is very helpful, Gabor. I've run the code to figure out the end of the line and here is what I am seeing at the end of each line: \r\n So, I specified like this: mydata<-read.csv.sql("myfile.txt", sep="|", eol="\r\n", sql = "select * from file limit 200, 100") However, again it's hanging agai
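For reference, one way to check which line terminator a file uses is to look at its raw bytes. This is a sketch under assumptions, not the code that was run in the thread: it writes a tiny two-line file with Windows-style endings and then inspects the byte just before the first line feed.

```r
# Write a small file in binary mode so the \r\n bytes are stored exactly as given.
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "wb")
writeBin(charToRaw("a|b|c\r\n1|2|3\r\n"), con)
close(con)

# Read the first bytes and find the first line feed (0x0a).
bytes <- readBin(tmp, what = "raw", n = 200)
nl <- which(bytes == as.raw(0x0a))[1]

# A carriage return (0x0d) right before the line feed means the file uses \r\n.
eol <- if (!is.na(nl) && nl > 1 && bytes[nl - 1] == as.raw(0x0d)) "\\r\\n" else "\\n"
cat("line terminator looks like:", eol, "\n")
```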

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Gabor Grothendieck
On Sat, Oct 23, 2010 at 7:44 AM, Dimitri Liakhovitski wrote: > Gabor, > maybe some of my code is wrong (I don't know sql at all). I tried the > following with just a few lines as a test: > library(sqldf) > mydata<-read.csv.sql("myfile.txt",sep="|", sql = "select * from file 200, > 100") "limit"

Re: [R] How long does skipping in read.table take

2010-10-23 Thread Dimitri Liakhovitski
Gabor, maybe some of my code is wrong (I don't know sql at all). I tried the following with just a few lines as a test: library(sqldf) mydata<-read.csv.sql("myfile.txt",sep="|", sql = "select * from file 200, 100") But it's just hanging. The same happened when I wrote: mydata<-read.csv.sql("myfile.

Re: [R] How long does skipping in read.table take

2010-10-22 Thread Gabor Grothendieck
On Fri, Oct 22, 2010 at 6:41 PM, Mike Marchywka wrote: >> From: ggrothendi...@gmail.com >> Date: Fri, 22 Oct 2010 18:28:14 -0400 >> To: dimitri.liakhovit...@gmail.com >> CC: r-help@r-project.org >> Subject: Re: [R] How long does skipping in read.table take >>

Re: [R] How long does skipping in read.table take

2010-10-22 Thread Gabor Grothendieck
On Fri, Oct 22, 2010 at 9:45 PM, Dimitri Liakhovitski wrote: > Gabor, > thanks a lot - sqldf might be a solution. However, do you know if > sqldf can also read in .txt files (with different delimiters)? > The data I am dealing with is "|" - delimited. So, I was using > read.table(...,sep="|") > I

Re: [R] How long does skipping in read.table take

2010-10-22 Thread Dimitri Liakhovitski
Gabor, thanks a lot - sqldf might be a solution. However, do you know if sqldf can also read in .txt files (with different delimiters)? The data I am dealing with is "|" - delimited. So, I was using read.table(...,sep="|") I looked at sqldf description - but did not see examples with .txt. Thanks

Re: [R] How long does skipping in read.table take

2010-10-22 Thread Mike Marchywka
> From: ggrothendi...@gmail.com > Date: Fri, 22 Oct 2010 18:28:14 -0400 > To: dimitri.liakhovit...@gmail.com > CC: r-help@r-project.org > Subject: Re: [R] How long does skipping in read.table take > > On Fri, Oct 22, 201

Re: [R] How long does skipping in read.table take

2010-10-22 Thread Gabor Grothendieck
On Fri, Oct 22, 2010 at 5:17 PM, Dimitri Liakhovitski wrote: > I know I could figure it out empirically - but maybe based on your > experience you can tell me if it's doable in a reasonable amount of > time: > I have a table (in .txt) with a 17,000,000 rows (and 30 columns). > I can't read it all

Re: [R] How long does skipping in read.table take

2010-10-22 Thread Mike Marchywka
> Date: Fri, 22 Oct 2010 17:17:58 -0400 > From: dimitri.liakhovit...@gmail.com > To: r-help@r-project.org > Subject: [R] How long does skipping in read.table take > > I know I could figure it out empirically - but maybe based on your > experience you can tell me if it

[R] How long does skipping in read.table take

2010-10-22 Thread Dimitri Liakhovitski
I know I could figure it out empirically - but maybe based on your experience you can tell me if it's doable in a reasonable amount of time: I have a table (in .txt) with 17,000,000 rows (and 30 columns). I can't read it all in (there are many strings). So I thought I could read it in in parts (e
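To make the question concrete, here is a small sketch of reading one part of a pipe-delimited file with `skip` and `nrows`, timed with `system.time`. The toy file and the numbers (skip 200 rows, read 100) are assumptions chosen so the example runs quickly. Note that `read.table` still has to scan past every skipped line, so the time spent skipping can be expected to grow with the number of lines skipped.

```r
# Toy pipe-delimited file standing in for the real 17-million-row table (assumption).
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(id = 1:1000, label = paste0("row", 1:1000))
write.table(toy, tmp, sep = "|", quote = FALSE, row.names = FALSE)

# Read the header once so later parts can reuse the column names.
header <- names(read.table(tmp, sep = "|", header = TRUE, nrows = 1))

# Skip the header line plus the first 200 data rows, then read the next 100 rows.
timing <- system.time(
  chunk <- read.table(tmp, sep = "|", header = FALSE, skip = 1 + 200,
                      nrows = 100, col.names = header,
                      stringsAsFactors = FALSE)
)

print(timing)
print(range(chunk$id))
```

Timing the same call with progressively larger `skip` values on the real file would give an empirical answer for that particular machine.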