Hi all, I encountered a problem when trying to read in an Illumina chip annotation file. The offending file is large, so I zipped it up and posted it at
http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/tmp/ProbeInfo_Expression.txt.bz2

Executing this:

annot = read.table(bzfile("ProbeInfo_Expression.txt.bz2"), comment.char="", sep = "\t", fill = TRUE, header = TRUE);

leads to

> dim(annot)
[1] 25952    28

i.e., 25952 rows were read, but the file is some 48000 rows long. The file contains long text entries (up to several thousand characters), which appear to be the problem, since stripping out those columns (outside of R) and re-reading gives the full 48k+ rows.

My question is: why does read.table stop reading (without any warning or error)? Am I missing something in the documentation? (I read it but didn't find anything.) Are there any arguments I'm not setting right? I tried to google the problem but came up empty-handed.

Session info:

> sessionInfo()
R version 2.11.1 Patched (2010-06-06 r52218)
i686-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Thanks,

Peter

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
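One thing that might be worth checking, offered only as a guess and not verified against the actual annotation file: read.table's default is quote = "\"'", so an apostrophe inside a free-text field (for instance a hypothetical entry like "3' UTR") may be treated as an opening quote, and everything up to the next apostrophe, line breaks included, could be absorbed into one field. The sketch below builds a small made-up tab-delimited file (the column names and contents are invented for illustration) and compares the row counts with the default quoting and with quote = "".

```r
# Toy tab-delimited file (hypothetical contents): five data rows,
# two of which contain an apostrophe inside a text field.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "ProbeID\tDefinition\tSymbol",
  "p1\tplain text\tA",
  "p2\t3' UTR of gene\tB",
  "p3\tplain text\tC",
  "p4\t5' end of gene\tD",
  "p5\tplain text\tE"
), tmp)

# Same arguments as in the post: the default quote set includes the apostrophe.
with_quotes <- read.table(tmp, sep = "\t", header = TRUE, fill = TRUE,
                          comment.char = "")

# Quoting disabled: apostrophes are read as ordinary characters.
no_quotes <- read.table(tmp, sep = "\t", header = TRUE, fill = TRUE,
                        comment.char = "", quote = "")

# Fields per line with quoting disabled (header line included).
fields <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")

nrow(with_quotes)  # expected to be smaller than nrow(no_quotes)
nrow(no_quotes)
table(fields)
```

If the real file behaves like this toy one, adding quote = "" to the original read.table call would be the first thing to try; running count.fields on the real file with and without quote = "" may show where lines are being merged.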