Re: [R] Read.table problems

Marc Schwartz Mon, 18 May 2009 10:00:35 -0700

On May 18, 2009, at 11:24 AM, Steve Murray wrote:

Dear all,
I have a file which I've converted from NetCDF (.nc) to text (.txt) using ncdump in Unix (as I had problems using the ncdf package to do this). The first few rows (as copied and pasted from the Unix console) of the file appear as follows:
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _,
As you can see, there are a lot of NA values before the actual numeric values start further down the dataset. My problem is that I'm having trouble reading this file into R. I think the problem lies with the sep= argument, although I may be wrong. I tried the following command at first, as the data appear to be comma separated:
read.table("test86.txt", skip=43, na.strings="-", header=FALSE, sep=",") -> test86 # skip=43 due to meta-data information being held in the initial rows
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
 line 29 did not have 25 elements
I then tried sep=" ", followed by sep="" but received a similar-type error message (although line 29 doesn't appear to be especially different from the rest).
I subsequently tried using sep=\t and then sep=\n. These both result in the data being read in without an error message being displayed, although the data are formatted as follows:
head(test86)
                                                                           V1
1 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _, _,
2 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _, _,
3 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _, _,
4 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _, _,
5 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _, _,
6 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,_, _, _,
dim(test86)
[1] 179899      1


Instead of one column, I'd expect there to be 720.
I think I'm getting something wrong relating to the sep= argument (or possibly mis-using na.strings?). If anyone has any solutions to this then I'd be very grateful to hear them.
Many thanks for any advice,

Steve



Two problems,

1. Your first line above has one more column/entry than the subsequent lines. If that is correct, you need to use the 'fill = TRUE' argument so that all subsequent rows are filled to have the same number of columns. If the above is due to a copy/paste error, then disregard this.

2. You are using a '-' (hyphen) as your 'na.strings' character, when the data is using a '_' (underscore).
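One way to check the first point before reading the whole file is to count the fields on each line with count.fields() from the utils package. The following is only a minimal sketch: the three lines are made-up stand-ins for the ncdump output, not the actual contents of test86.txt.

```r
# Hypothetical sample lines standing in for the real file (an assumption,
# not the poster's data); the first line has one more field than the rest.
sample_lines <- c("_, _, _, _",
                  "_, _, _",
                  "1.5, _, 2.5")

# count.fields() reports how many sep-delimited fields each line contains.
con <- textConnection(sample_lines)
n_fields <- count.fields(con, sep = ",")
close(con)

# Tabulate the field counts and list the lines that differ from line 1.
print(table(n_fields))
odd_lines <- which(n_fields != n_fields[1])
print(odd_lines)
```

With the real file, something like count.fields("test86.txt", sep = ",", skip = 43) should show whether every line has the same number of fields or whether 'fill = TRUE' is needed.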

Additionally, I would use 'strip.white = TRUE', to aid in getting rid of extraneous white space around your fields/separators. That will also help with column separations.



Thus (on OSX) with the above data copied to the clipboard:

> read.table(pipe("pbpaste"), na.strings = "_", sep = ",", fill = TRUE, strip.white = TRUE)
   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26
1  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
2  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
3  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
4  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
5  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
6  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
7  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
8  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
9  NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
10 NA NA NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
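For anyone not on OSX, the same call can be tried without pbpaste by passing the data through the 'text' argument of read.table(), which is available in current versions of R. This is a sketch on invented data: the rows below are hypothetical and only mimic the underscore-for-missing layout, with a ragged second row to exercise 'fill = TRUE'.

```r
# Hypothetical rows (an assumption, not the poster's data): underscores mark
# missing values and the second row is one field short.
sample_rows <- c("_, _, 1.5, _",
                 "_, 2.5, _",
                 "_, _, _, _")

dat <- read.table(text = sample_rows,
                  sep = ",",           # fields are comma separated
                  na.strings = "_",    # underscore, not hyphen, marks NA
                  fill = TRUE,         # pad short rows to a common width
                  strip.white = TRUE,  # drop the blanks around each field
                  header = FALSE)

print(dim(dat))
print(dat)
```

For the original file, the same arguments would be combined with the file name and 'skip = 43' in place of 'text'.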




HTH,

Marc Schwartz

