Dear Jim,

2013/8/5 jim holtman <jholt...@gmail.com>:
> Couple of things to try.  May have an extra quote, so put:
>
> quote = ''

thank you very much. That did the trick.

Much obliged!

>
> as one of the parameters.  Also, might have comments, so try:
>
> comment.char = ""
>
> Take a look at your file and determine what line was the last complete one
> and see if there might be a problem in that line, or preceding ones.
>
> On Mon, Aug 5, 2013 at 7:11 AM, Asis Hallab <asis.hal...@gmail.com> wrote:
>>
>> Dear R experts,
>>
>> I have a large table saved in a file called "plant_genome.gff". The
>> file has 481848 lines in nine columns, which are TAB delimited, and is
>> 53 megabytes in size.
>> For anyone who might know the GFF3 format: The table holds a plant
>> genome's annotation.
>>
>> If I read in the table with
>> read.table( "plant_genome.gff" )
>> I get the following error
>> "line 2 did not have 12 elements".
>>
>> If I read in the table with
>> read.table( "plant_genome.gff", sep="\t" )
>> no error or warning is given, but my resulting table has only 193547
>> instead of the expected 481848 rows! 60% of the lines are omitted.
>>
>> Also passing in the arguments
>> as.is = TRUE
>> or setting the columns' classes with
>> colClasses = c( "character", …, "integer", "integer", "numeric",
>> "character", … )
>>    # columns 4 and 5 are integers, column 6 is numeric, all others
>> are characters
>> does not resolve the problem.
>>
>> If I read in the file with readLines and then manually split them using
>> strsplit(…)
>> and combine them into a data.frame with
>> as.data.frame( do.call( "rbind", splitted.lines ), colClasses=…)
>> I get the expected and correct data.frame, representing my GFF3 data.
>>
>> My questions are:
>> 1) Am I using read.table wrong, or did I miss something in the
>> documentation?
>> 2) Or is this a known problem with large TAB delimited tables, whose
>> columns contain white-spaces and are not surrounded by quotes?
>>
>> Unfortunately due to the unpublished nature of the plant genome I am
>> not allowed to give access to the GFF table that causes this problem.
>>
>> Any ideas, hints, help - or comments on my stupidity having missed
>> something important - will be much appreciated!
>>
>> Cheers!
>>

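For readers of the archive, the fix confirmed above can be sketched on a small stand-in file. The four lines below are invented for illustration and are not taken from the unpublished genome. The sketch assumes the missing rows come from apostrophes in the attributes column (as in "5' UTR"), which read.table's default quote set treats as the start of a quoted string that runs on across line breaks until the next apostrophe. count.fields is used as a diagnostic, since it is expected to report NA for lines that fall inside such a string.

```r
# Hypothetical miniature of a GFF3 file: nine TAB-separated columns,
# with apostrophes in the attributes column of lines 1 and 3.
gff_lines <- c(
  "chr1\tsrc\tfive_prime_UTR\t1\t50\t.\t+\t.\tID=u1;Note=5' UTR",
  "chr1\tsrc\tCDS\t51\t200\t.\t+\t0\tID=c1;Note=coding",
  "chr1\tsrc\tthree_prime_UTR\t201\t260\t.\t+\t.\tID=u2;Note=3' UTR",
  "chr1\tsrc\tgene\t1\t260\t0.9\t+\t.\tID=g1;Note=whole gene"
)
gff_file <- tempfile(fileext = ".gff")
writeLines(gff_lines, gff_file)

# Diagnostic: fields per line with the default quote characters,
# then with quoting switched off.
fields_default <- count.fields(gff_file, sep = "\t")
fields_plain   <- count.fields(gff_file, sep = "\t", quote = "")

# Default quoting: lines between two apostrophes are merged into one record.
with_quotes <- read.table(gff_file, sep = "\t", as.is = TRUE)

# Quoting disabled, as suggested in the thread: one row per input line.
no_quotes <- read.table(gff_file, sep = "\t", quote = "", as.is = TRUE)

print(fields_default)
print(fields_plain)
print(c(default = nrow(with_quotes), quote_off = nrow(no_quotes)))
```

On the real file, comparing sum(is.na(count.fields("plant_genome.gff", sep = "\t"))) with the same call plus quote = "" should show whether quoting accounts for the missing rows.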