I suggest looking at the lines preceding the one where the error was found,
with both print and cat:
   print(lines[5933 - (10:0)])
   cat(lines[5933 - (10:0)], sep="\n")
If things are not obvious after looking at them, see if read.table can read
just those lines
   read.table(text=lines[5933 - (10:0)], sep="\t", stringsAsFactors=FALSE)
If it can, try backing up more than 10 lines.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf
> Of Dario Strbenac
> Sent: Friday, October 04, 2013 5:01 AM
> To: r-help@r-project.org
> Subject: [R] Tab Separated File Reading Error
>
> Hello,
>
> I have a seemingly simple problem that a tab-delimited file can't be read in.
>
> > annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 5933 did not have 12 elements
>
> However, all lines do have 12 columns.
>
> > lines <- readLines("matched.txt")
> > tabsPosns <- gregexpr("\t", lines)
> > table(sapply(tabsPosns, length))
>
>     11
> 367274
>
> > system("wc -l matched.txt")
> 367274 matched.txt
>
> You can obtain the file from
> https://dl.dropboxusercontent.com/u/37992150/matched.txt
>
> The line does not contain comment or quote characters. What can you suggest ?
>
> > sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
>  [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods
> [7] base
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.1
>
> --------------------------------------
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
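A possible way to extend the "look at the preceding lines" advice, sketched on a made-up file rather than on matched.txt: read.table treats both " and ' as quote characters and # as a comment character by default, while gregexpr only counts raw tabs, so the two can disagree about where a record ends. count.fields uses the same rules as scan, so comparing its result with and without quote handling may point to the line where the trouble starts. The demo file, its contents, and the variable names below are assumptions for illustration only; nothing here is known about the actual cause in matched.txt.

```r
# Hypothetical demo file: 5 lines, 3 tab-separated fields each. The
# apostrophes on lines 2 and 4 are an assumption for illustration only.
demo <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc",
             "d\t5' end\tf",
             "g\th\ti",
             "j\t3' end\tl",
             "m\tn\to"), demo)

# Raw tab counts, as in the original post: every line has 2 tabs.
lines <- readLines(demo)
rawTabs <- sapply(gregexpr("\t", lines, fixed = TRUE), length)
print(table(rawTabs))

# Field counts under read.table()'s default quoting rules, and with
# quote and comment handling switched off.
quotedCounts <- count.fields(demo, sep = "\t")
plainCounts  <- count.fields(demo, sep = "\t", quote = "", comment.char = "")

# Lines where the two views disagree. count.fields() records NA on the
# line where a quoted string starts and runs on past the end of the line.
suspects <- which(is.na(quotedCounts) | quotedCounts != plainCounts)
print(suspects)

# Reading with quoting and comments disabled keeps one row per line.
fixed <- read.table(demo, sep = "\t", quote = "", comment.char = "",
                    stringsAsFactors = FALSE)
print(dim(fixed))
```

For the real file, the same comparison would use "matched.txt" in place of demo. If suspects turns out to be empty there, quoting is probably not the problem, and backing up further from line 5933 as suggested above is still the next step.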