Thanks for everyoneâs feedback. It may be that the person who is using SQL Server 2005 is doing something that is not helpful.
In the data there are supposed to be 10 columns with headers. The first column is a date time that I was reading in as a string. The second is a string. The rest are numeric. While I am focusing on one file I will have to deal with this issue on an ongoing basis. There are no accented characters in the dataset. There should not be any problematic characters in the dataset. When I go into Excel and save as comma separated and then do text to columns with comma as the separator and then format the first column as date (m/d/Y) I can import into R and I am on my way. When I look at the data in Notepad, it looks like an ordinary comma separated file. When I open the file in Excel 2007, I notice that each rowâs data is only in column A. When I do nothing with the file and just try to read the file into R, the following happens. > test06 = read.table("test2006.csv", sep = ",", header = TRUE, as.is = TRUE) > test06[1:6, 1:2] Error in `[.data.frame`(test06, 1:6, 1:2) : undefined columns selected > test06[1:6] Error in `[.data.frame`(test06, 1:6) : undefined columns selected > head(test06) ÿþD 1NA 2NA 3NA 4NA 5NA 6NA This alternative produced the same results. test06 = read.table("test2006.csv", sep = ",", header = TRUE, stringsAsFactors = FALSE) Using a dataset based on a SQL Server 2008 book but using SQL Server 2012, I could NOT duplicate the problem. I used two strategies. I copied the output and then did file save as comma separated and got everything lined up nicely, but no headers. If I pasted into Excel, then I got the headers. But otherwise I was able to read everything into R. Interestingly different things happen depending on whether I try to open from Windows 7 Explorer or from within Excel. There are no Excel issues if I open with Explorer. If I try to open with Excel 2007 I get different behavior depending on how I saved in SQL Server 2012. If I did select all; copy, save as csv in SQL, then when trying to open in Excel I get the data import wizard for a DELIMITED file. This issue does not arise when opening from Explorer. If I did select all; copy in SQL and then pasted into Excel and saved as csv, there are no problems whether I open in Explorer or Excel. No import wizard. When I try to open the data file that I need in Explorer, the file opens but everything is in the first column. When I try to open the data file that I need using Excel 2007, the import wizard opens and starts off as FIXED WIDTH. As I began to have a better understanding of the issues, I thought back to what my colleague wrote when I stated the problem to the colleague. I think the key may be instruction 2. I did a Find from within Notepad and also from within Excel and could not find any quote marks. The colleague's advice. Instruction 3 relates to the fact that the data values in column 2 are padded on the right with spaces. These files are created by SQL on a server.Hereâs an easy way to read the files. 1) Open up each file in Notepad, replace all â with a space, then save 2) Change the file type to â.txtâ 3) Open each file in excel using a comma as a delimitator. [You might change column B to text format or use âfind and replaceâ to delete the spaces in the StockID.] On 10/9/2013 2:37 AM, Milan Bouchet-Valat wrote: > Le mardi 08 octobre 2013 à 16:02 -0700, Ira Sharenow a écrit : >> A colleague is sending me quite a few files that have been saved with MS >> SQL Server 2005. I am using R 2.15.1 on Windows 7. >> >> I am trying to read in the files using standard techniques. Although the >> file has a csv extension when I go to Excel or WordPad and do SAVE AS I >> see that it is Unicode Text. Notepad indicates that the encoding is >> Unicode. Right now I have to do a few things from within Excel (such as >> Text to Columns) and eventually save as a true csv file before I can >> read it into R and then use it. >> >> Is there an easy way to solve this from within R? I am also open to easy >> SQL Server 2005 solutions. >> >> I tried the following from within R. >> >> testDF = read.table("Info06.csv", header = TRUE, sep = ",") >> >>> testDF2 = iconv(x = testDF, from = "Unicode", to = "") >> Error in iconv(x = testDF, from = "Unicode", to = "") : >> >> unsupported conversion from 'Unicode' to '' in codepage 1252 >> >> # The next line did not produce an error message >> >>> testDF3 = iconv(x = testDF, from = "UTF-8" , to = "") >>> testDF3[1:6, 1:3] >> Error in testDF3[1:6, 1:3] : incorrect number of dimensions >> >> # The next line did not produce an error message >> >>> testDF4 = iconv(x = testDF, from = "macroman" , to = "") >>> testDF4[1:6, 1:3] >> Error in testDF4[1:6, 1:3] : incorrect number of dimensions >> >>> Encoding(testDF3) >> [1] "unknown" >> >>> Encoding(testDF4) >> [1] "unknown" >> >> This is the first few lines from WordPad >> >> Date,StockID,Price,MktCap,ADV,SectorID,Days,A1,std1,std2 >> >> 2006-01-03 >> 00:00:00.000,@Stock1,2.53,467108197.38,567381.144444444,4,133.14486997089,-0.0162107939626307,0.0346283580367959,0.0126471695454834 >> >> 2006-01-03 >> 00:00:00.000,@Stock2,1.3275,829803070.531114,6134778.93292,5,124.632223896458,0.071513138376339,0.0410694546850102,0.0172091268025929 > What's the actual problem? You did not state any. Do you get accentuated > characters that are not printed correctly after importing the file? In > the two lines above it does not look like there would be any non-ASCII > characters in this file, so encoding would not matter. > > > Regards > [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.