Re: [R] R eat my data

2010-05-25 Thread Changbin Du
> id_gname<-read.table("/home/cdu/operon/id_name_gh5.txt", sep="\t", quote="", skip=0, header=F, fill=T) > dim(id_gname) [1] 1932 3 Yes, it works after adding quote="" to the read table options. Thanks, Chris! On Tue, May 25, 2010 at 9:34 AM, Chris Stubben wrote: > > Gene names often hav

Re: [R] R eat my data

2010-05-25 Thread Chris Stubben
Gene names often have single quotes like 5'-methylthioadenosine phosphorylase ATP synthase B' chain ppGpp 3'-pyrophosphohydrolase so maybe try adding quote="" to the read table options. Chris Stubben -- View this message in context: http://r.789695.n4.nabble.com/R-eat-my-data-tp2230217p223
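
A minimal sketch of the effect described above, run on a small made-up file rather than the poster's data (the IDs and the temporary file are assumptions for illustration; only the gene names come from the thread). With the default quote characters, the two apostrophes are read as an opening and a closing quote, so the text between them, line break included, ends up in one record. Setting quote="" turns that off.

```r
# Made-up tab-separated rows; the IDs are invented for this example.
rows <- c(
  "1\t5'-methylthioadenosine phosphorylase",
  "2\tATP synthase B' chain",
  "3\tABC-2 type transporter",
  "4\tconserved hypothetical protein"
)
tmp <- tempfile(fileext = ".txt")
writeLines(rows, tmp)

# Default quoting: both ' and " are treated as quote characters.
default_quote <- read.table(tmp, sep = "\t", header = FALSE, fill = TRUE)

# Quoting disabled: apostrophes are ordinary characters.
no_quote <- read.table(tmp, sep = "\t", quote = "", header = FALSE, fill = TRUE)

nrow(default_quote)  # fewer rows than there are lines in the file
nrow(no_quote)       # one row per line
```

Comparing the two row counts against the line count of the file is a quick way to tell whether quoting is what is swallowing rows.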

Re: [R] R eat my data

2010-05-25 Thread Joris Meys
The last entries in the dataframe, how do they look? On Tue, May 25, 2010 at 6:12 PM, Changbin Du wrote: > 644727344  ABC-2 type transporter  ABC-2 type transporter > 644727345  conserved hypothetical protein  conserved hypothetical > protein > > Here is the last two lines of the file id

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
> your search (the first place I'll look for is the line where R stops reading. > See if anything is strange there.) > > Also, changing "read.table" to "read.delim" often works. > > ...Tao > > > > > > - Original Message > > From

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
> ...Tao > > > > > > - Original Message > > From: Changbin Du > > To: David Winsemius > > Cc: r-help@r-project.org > > Sent: Tue, May 25, 2010 9:12:58 AM > > Subject: Re: [R] R eat my data > > > > 644727344  ABC-2 type transporte

Re: [R] R eat my data

2010-05-25 Thread Shi, Tao
I'll look for is the line where R stops reading. See if anything is strange there.) Also, changing "read.table" to "read.delim" often works. ...Tao - Original Message > From: Changbin Du > To: David Winsemius > Cc: r-help@r-project.org > S

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
Thank you all for the contributions! I will send the data back to the computer guys who collected the data yesterday. Actually, the data can be opened in Excel and a text editor. After replacing some ";", > gene_name<-read.table("/home/cdu/operon/id_name_gh5.txt", sep="\t", skip=0, header=FALSE, fill=TRUE) >

Re: [R] R eat my data

2010-05-25 Thread David Winsemius
Have you compared them to tail(gene_name, 2)? Come on, man, show some initiative. On May 25, 2010, at 12:12 PM, Changbin Du wrote: > 644727344  ABC-2 type transporter  ABC-2 type transporter > 644727345  conserved hypothetical protein  conserved > hypothetical protein > > Here is the

Re: [R] R eat my data

2010-05-25 Thread Kevin E. Thorpe
When I encounter problems like this, I make sure each row has the expected number of columns. Something like the following awk code is useful. awk -F"\t" '{print NF}' id_name_gh5.txt | sort | uniq -c Note: I'm not sure if the \t will work with the -F switch as above. Kevin Changbin Du wrote
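
The same per-row column check can be sketched from inside R with count.fields(), which reports the number of fields on each line. The file below is invented for illustration (rows, names and the expected column count of 3 are assumptions); quote and comment handling are switched off so that only the tab separators are counted.

```r
# Made-up file: two complete rows and one row that is a column short.
rows <- c(
  "1\tgene A\tgene A",
  "2\tgene B\tgene B",
  "3\tgene C"
)
tmp <- tempfile(fileext = ".txt")
writeLines(rows, tmp)

# Count tab-separated fields per line, ignoring quotes and comment characters.
n_fields <- count.fields(tmp, sep = "\t", quote = "", comment.char = "")

table(n_fields)        # how many lines have each field count
which(n_fields != 3)   # line numbers that do not have the expected 3 fields
```

The table() output plays the role of the `sort | uniq -c` step, and which() points at the offending line numbers directly.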

Re: [R] R eat my data

2010-05-25 Thread Joris Meys
Without any clue about your data file, this is definitely unsolvable. But some things to consider: Where is the dataset coming from? Did you check for special characters? Is there an apostrophe somewhere in a string? (That messed up things for me once). Is the delimiter placed correctly everywher

Re: [R] R eat my data

2010-05-25 Thread Sarah Goslee
Without the actual file to look at, this is like playing 20 questions, only not so much fun. However, this kind of problem is most often caused by the presence in your file of something that R interprets as a special character, usually # or ' or ". Can you open the file in a spreadsheet? Can you
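
One way to look for such characters without opening the file by hand is to search the raw lines. A rough sketch on an invented file (all rows below, including the "protein #7" entry, are assumptions and not the poster's data):

```r
# Made-up rows: line 2 contains an apostrophe, line 3 contains a #.
rows <- c(
  "1\tABC-2 type transporter",
  "2\tppGpp 3'-pyrophosphohydrolase",
  "3\tprotein #7",
  "4\tconserved hypothetical protein"
)
tmp <- tempfile(fileext = ".txt")
writeLines(rows, tmp)

# Read the file as plain text, so no quote or comment processing takes place.
raw_lines <- readLines(tmp)

# Indices of lines containing #, ' or ".
suspect <- grep("[#'\"]", raw_lines)
suspect
raw_lines[suspect]
```

If lines turn up, the quote and comment.char arguments of read.table are the usual settings to review.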

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
c...@nuuk:~/operon$ grep '^#' id_name_gh5.txt c...@nuuk:~/operon$ No lines start with #. On Tue, May 25, 2010 at 9:11 AM, Barry Rowlingson < b.rowling...@lancaster.ac.uk> wrote: > On Tue, May 25, 2010 at 4:42 PM, Changbin Du wrote: > > HI, Dear R community, > > > > My original file has 1932 l

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
644727344  ABC-2 type transporter  ABC-2 type transporter 644727345  conserved hypothetical protein  conserved hypothetical protein Here are the last two lines of the file id_name_gh5.txt. On Tue, May 25, 2010 at 8:57 AM, David Winsemius wrote: > > On May 25, 2010, at 11:42 AM, Changbin

Re: [R] R eat my data

2010-05-25 Thread Mohamed Lajnef
Check the comments sent by David! M Changbin Du wrote: > length(count.fields("/home/cdu/operon/id_name_gh5.txt")) [1] 1932 It is 1932 lines when counted in R On Tue, May 25, 2010 at 8:52 AM, Mohamed Lajnef <mohamed.laj...@inserm.fr> wrote: Hi Changbin, Try to use this

Re: [R] R eat my data

2010-05-25 Thread Barry Rowlingson
On Tue, May 25, 2010 at 4:42 PM, Changbin Du wrote: > HI, Dear R community, > > My original file has 1932 lines, but when I read into R, it changed to 1068 > lines, how comes? > > > c...@nuuk:~/operon$ wc -l id_name_gh5.txt > 1932 id_name_gh5.txt > > >> gene_name<-read.table("/home/cdu/operon/id_n

Re: [R] R eat my data

2010-05-25 Thread Changbin Du
> length(count.fields("/home/cdu/operon/id_name_gh5.txt")) [1] 1932 It is 1932 lines when counted in R On Tue, May 25, 2010 at 8:52 AM, Mohamed Lajnef wrote: > Hi Changbin, > > Try to use this code in R to count the lines of your file without opening it > > length(count.fields("id_name_gh5.txt"))

Re: [R] R eat my data

2010-05-25 Thread David Winsemius
On May 25, 2010, at 11:42 AM, Changbin Du wrote: HI, Dear R community, My original file has 1932 lines, but when I read into R, it changed to 1068 lines, how comes? We are being asked to investigate this quest, how? Have you looked at the last line to see if it looks like gene_name? Isn

Re: [R] R eat my data

2010-05-25 Thread Mohamed Lajnef
Hi Changbin, Try to use this code in R to count the lines of your file without opening it: length(count.fields("id_name_gh5.txt")) Regards Mohamed Changbin Du wrote: HI, Dear R community, My original file has 1932 lines, but when I read into R, it changed to 1068 lines, how comes? c

[R] R eat my data

2010-05-25 Thread Changbin Du
Hi, Dear R community, My original file has 1932 lines, but when I read it into R, it changed to 1068 lines. How come? c...@nuuk:~/operon$ wc -l id_name_gh5.txt 1932 id_name_gh5.txt > gene_name<-read.table("/home/cdu/operon/id_name_gh5.txt", sep="\t", skip=0, header=F, fill=T) > dim(gene_name) [1