On 21-Oct-10 23:23:03, mkinseth wrote:
> I'm new to R. I have a mac (OS 10.6). I have converted an Excel
> file to a csv to import into R. I have used many methods to
> import the file; most do not work. The best so far is:
>
>   filename <- read.csv("/Users/Desktop/csvfile.csv",
>                        header=T, sep=",")
>
> I have also tried taking out the header and sep arguments and it
> still imports fine.
>
> Regardless of what I do, it always brings in the data but also
> includes all 16,000+ columns from Excel, with X.1 as the first
> column up to X.16345 (for instance). Of course the columns are
> filled with "NA" since no data is present.
>
> How do I get around this?
>
> So far I have included strip.white=T and fill=T, something I've
> seen in other csv import posts, but this does not work.
>
> Thanks!
"strip.white=T" is not going to solve the problem, since this only
changes the values of character fields which are present (by removing
leading and trailing blanks). It will not remove blank fields.

Nor will "fill=T": if there is, anywhere in the CSV file, a row with
more commas in it than the others, then *all* the rows will be
extended so as to have the same number of fields as the excessively
long one. See the result of entering ?read.csv for more detail.

Since, you say, you have converted the Excel file to CSV yourself,
the solution is probably in your own hands. Excel is notorious for
messing up data structures in ways which are not readily perceptible
to the user.

One way in which this can happen is that if, during data entry, the
cursor wanders beyond the intended limits of the data set (e.g. to a
row below the last row of the data, or to a column to the right of
the last column of the data), then, without the user entering
anything whatever into the cell where the cursor happens to be, Excel
may register that cell as having content. The user is unlikely to be
aware that this has happened, but once it has happened the CSV file
will contain a rectangular block which includes this "outlying" cell,
hence too many rows and/or too many columns.

One way to try to overcome this, in Excel, is to select the area of
the table to be exported prior to initiating the export to CSV. I
won't advise on techniques for that, since I haven't used an Excel
version later than Win98 ...! (and I have never been happy that I
have done the right thing in any Excel operation). I hope others will
give detailed advice.

Then you should find that the CSV contains only as many rows and
columns as you have in your data, and you should be OK. It may simply
be that, by default, your Excel has saved the whole spreadsheet, so
you get all the columns. The same solution should apply.

Hoping this helps,
Ted.
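[Editor's note: if re-exporting from Excel is inconvenient, the empty
columns can also be dropped on the R side after import. This is a
sketch, not from the original thread; the file path is illustrative.
Note that Excel's phantom columns often arrive as empty strings ""
rather than NA, so it helps to tell read.csv to treat "" as NA.]

```r
## Read the CSV, treating empty fields ("") as NA as well as the
## literal string "NA" (the path below is just an example).
dat <- read.csv("/Users/Desktop/csvfile.csv", header = TRUE,
                na.strings = c("NA", ""))

## Keep only the columns that contain at least one non-NA value.
dat <- dat[, colSums(!is.na(dat)) > 0, drop = FALSE]

## Similarly, drop any rows that are entirely NA.
dat <- dat[rowSums(!is.na(dat)) > 0, , drop = FALSE]
```

This does not fix the underlying spreadsheet, so the stray cells will
come back on the next export; selecting the data range in Excel
before saving, as described above, remains the cleaner fix.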
--------------------------------------------------------------------
E-Mail: (Ted Harding) <[email protected]>
Fax-to-email: +44 (0)870 094 0861
Date: 22-Oct-10       Time: 11:14:19
------------------------------ XFMail ------------------------------

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

