Hi Gabor,

I replaced multiple spaces with a single one and tried the code you suggested. I got:

> library(sqldf)
Loading required package: RSQLite
Loading required package: DBI
Loading required package: gsubfn
Loading required package: proto
> source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R")
> myfile <- file("243_47mel_withnormal_expression_log2.txt")
> stmt <- read.table(myfile, nr = 1, as.is = TRUE)
> stmt <- stmt[regexpr("call", stmt) < 0]
> stmt <- paste("select", paste(stmt, collapse = ","), "from myfile")
> myfile <- file("243_47mel_withnormal_expression_log2.txt")
> DF <- sqldf(stmt, file.format = list(sep = " "))
Error in try({ :
  RS-DBI driver: (RS_sqlite_import: ./243_47mel_withnormal_expression_log2.txt line 6651 expected 488 columns of data but found 641)
In addition: Warning message:
closing unused connection 3 (243_47mel_withnormal_expression_log2.txt)
Error in sqliteExecStatement(con, statement, bind.data) :
  RS-DBI driver: (error in statement: unrecognized token: "2SignalA")
>

What can you suggest? Is there something wrong with the input file that you can think of?

Thanks!
Allen

On Nov 10, 2007 10:37 AM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Thanks.
>
> On Nov 10, 2007 10:29 AM, affy snp <[EMAIL PROTECTED]> wrote:
> > Gabor,
> >
> > I will do it either later today or tomorrow. Promised.
> >
> > Allen
> >
> > On Nov 10, 2007 10:23 AM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> > > Please try out the sqldf solution as well and let me know
> > > how it compares since I have never tried anything
> > > this large and would be interested to know.
> > >
> > > On Nov 10, 2007 9:27 AM, affy snp <[EMAIL PROTECTED]> wrote:
> > > > Thanks all for the help and suggestions. By specifying colClasses in
> > > > read.table() and running it on a server with 8 GB of memory, I could
> > > > have the data read in 2 minutes.
> > > > I will just skip the sqldf method for now and get back to it in a moment.
> > > >
> > > > Best,
> > > > Allen
> > > >
> > > > On Nov 10, 2007 2:42 AM, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> > > > > Did you read the Note on the help page for read.table, or the 'R Data
> > > > > Import/Export Manual'? There are several hints there, some of which
> > > > > will be crucial to doing this reasonably fast.
> > > > >
> > > > > How big is your computer? That is 116 million items (you haven't
> > > > > told us what type they are), so you will need GBs of RAM, and
> > > > > preferably a 64-bit OS. Otherwise you would be better off using a
> > > > > DBMS to store the data (see the Manual mentioned in my first para).
> > > > >
> > > > > On Fri, 9 Nov 2007, affy snp wrote:
> > > > >
> > > > > > Dear list,
> > > > > >
> > > > > > I need to read in a big table with 487 columns and 238,305 rows
> > > > > > (row names and column names are supplied). Is there a code to read
> > > > > > in the table in a fast way? I tried the read.table() but it seems
> > > > > > that it takes forever :(
> > > > > >
> > > > > > Thanks a lot!
> > > > > >
> > > > > > Best,
> > > > > > Allen
> > > > >
> > > > > --
> > > > > Brian D. Ripley,                  [EMAIL PROTECTED]
> > > > > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > > > > University of Oxford,             Tel:  +44 1865 272861 (self)
> > > > > 1 South Parks Road,                     +44 1865 272866 (PA)
> > > > > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> > > >
> > > > ______________________________________________
> > > > R-help@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
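
The "expected 488 columns of data but found 641" error suggests that some lines of the input file may not have the same number of fields as the rest. One way to check this is to count the fields on every line with count.fields(). The following is only a minimal sketch: find_ragged_lines is a hypothetical helper name, the space separator is assumed to match the sep = " " used in the sqldf call, and the demonstration runs on a small temporary file rather than on the real data.

```r
# Hypothetical helper (not part of sqldf or base R): list the lines whose
# field count differs from the most common field count in the file.
find_ragged_lines <- function(path, sep = " ") {
  # count.fields() returns the number of fields found on each line.
  # Quoting and comment handling are switched off so that every
  # character is treated literally; blank lines are kept so that the
  # reported positions match line numbers in the file.
  n <- count.fields(path, sep = sep, quote = "", comment.char = "",
                    blank.lines.skip = FALSE)
  # Treat the most frequent field count as the expected one.
  expected <- as.integer(names(which.max(table(n))))
  bad <- which(n != expected)
  data.frame(line = bad, fields = n[bad],
             expected = rep(expected, length(bad)))
}

# Demonstration on a small temporary file (assumed data, not the real file).
tmp <- tempfile(fileext = ".txt")
writeLines(c("id a b", "r1 1 2", "r2 3 4 5", "r3 6 7"), tmp)
ragged <- find_ragged_lines(tmp)
print(ragged)
```

If the header line has one field fewer than the data lines (column names without a name for the row-name column), it will be reported as well, which is harmless. Lines reported with many more fields than expected would be the ones to inspect by hand.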
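
For comparison, the read.table() approach mentioned in the thread (specifying colClasses, as the Note on the read.table help page suggests) can be sketched roughly as follows. This is an illustration under assumptions, not the exact call that was used: read_big_table is a hypothetical wrapper, the first column is assumed to hold the row names, all remaining columns are assumed to be numeric, and the demonstration uses a tiny temporary file.

```r
# Hypothetical wrapper: read a whitespace-delimited table with row names in
# the first column, declaring the column types up front so that read.table()
# does not have to guess them.
read_big_table <- function(path, n_numeric, nrows = -1) {
  read.table(
    path,
    header = TRUE,
    row.names = 1,
    # One class for the row-name column, then one per data column.
    colClasses = c("character", rep("numeric", n_numeric)),
    nrows = nrows,        # an upper bound on the row count, if known
    comment.char = "",    # no comment processing
    quote = ""            # no quote processing
  )
}

# Demonstration on a small temporary file (assumed data, not the real file).
tmp <- tempfile(fileext = ".txt")
writeLines(c("id a b", "r1 1.5 2", "r2 3 4.25"), tmp)
df <- read_big_table(tmp, n_numeric = 2, nrows = 2)
str(df)
```

For the table described in the thread, n_numeric would be 487 and nrows 238305, assuming every data column really is numeric.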