> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of andrewH
> Sent: Wednesday, November 14, 2012 2:34 PM
> To: r-help@r-project.org
> Subject: Re: [R] Getting information encoded in a SAS, SPSS or Stata
> command file into R.
>
> Dear Anthony –
>
> On closer examination, what I am talking about is not factor levels, but
> something different (but analogous). The data that is categorical all has
> integer codes, so the file is entirely numeric. The SAS proc format then
> gives text strings for each code for each categorical variable. Like this:
>
> value REGION_f
>   11 = "New England Division"
>   12 = "Middle Atlantic Division"
>   21 = "East North Central Division"
>   22 = "West North Central Division"
>   31 = "South Atlantic Division"
>   32 = "East South Central Division"
>   33 = "West South Central Division"
>   41 = "Mountain Division"
>   42 = "Pacific Division"
>   97 = "State not identified"
>
> So it would make sense to have a lookup table of these codes linked to the
> variables. I’m not sure if it makes more sense to have that table live in R
> or in the database. For R purposes, I imagine it would make sense to convert
> these integer-valued variables into factors.
>
> What I do not understand is how SAS knows where the variables begin and end.
> I managed to break off a little hunk of the beginning of my file and look at
> it in an editor, and it is numbers without any obvious delimiters. Is the
> delimiter a particular numeric string? I thought the SAS command file would
> contain the starting location for each of the fixed-length fields, but I do
> not see anything in the file that could be interpreted that way – just a
> little wraparound code and then a long list of variable names followed by
> triplets of a code, an equals sign, and a text string, terminating with a
> semicolon.
>
> I’m sorry if I am being obtuse. When I said before that I had saved the SAS
> files as flat files, what I really meant was that I had an intern do it.
> When I was doing my own analysis, I mainly used TSP, before I switched to R
> about a year ago. I’ve never used SAS.
>
> I find your data project very interesting. Very. It is not actually
> necessary to wait for BLS to release the older CEX files, if you can lay
> your hands on the CDs. I spoke to the BLS data products office about 2
> years ago, and they have no problem with people republishing purchased data
> in any format they like, including simple duplication. In fact, they seemed
> to like the idea. I think the sale of data was forced on them by some kind
> of mandate from above.
>
> I'll be playing with your code (which is a model of readability, and a
> lesson to me on same, BTW) and keep you posted on my progress.
>
> Warmly, Andrew
>
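
On the R side of the question quoted above, one way to attach the REGION_f labels is to convert the integer-coded column to a factor. This is only a minimal sketch, assuming the codes have already been read into a plain numeric vector; the name `region_codes` and its sample values are hypothetical, while the levels and labels are copied from the format in the quoted message:

```r
# Integer codes and labels copied from the REGION_f format quoted above.
region_levels <- c(11, 12, 21, 22, 31, 32, 33, 41, 42, 97)
region_labels <- c(
  "New England Division", "Middle Atlantic Division",
  "East North Central Division", "West North Central Division",
  "South Atlantic Division", "East South Central Division",
  "West South Central Division", "Mountain Division",
  "Pacific Division", "State not identified"
)

# Hypothetical sample of the integer-coded column as read from the flat file.
region_codes <- c(11, 42, 97, 21, 11)

# factor() maps each code to its label; any code not listed in `levels`
# would become NA, which makes unexpected codes easy to spot.
region <- factor(region_codes, levels = region_levels, labels = region_labels)

# Counts per label, including labels that do not occur in the sample.
table(region)
```

Keeping `region_levels` and `region_labels` as a pair of vectors (or as two columns of a small data frame) also gives the lookup table mentioned in the quoted message, so the same table can be reused for other files that share the coding.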
Andrew,

R-help is not really the venue for discussing SAS programming and how the SAS data step reads fixed width files. If you want to email me (off-list) the SAS program/script for reading the data, I would be willing to explain what it is doing.

Dan

Daniel Nordlund
Bothell, WA USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.