David's

?measurement
measurement(mz$age_variable) <- "interval"
# where age_variable is the unstated item in that "select" list

is what I use in similar circumstances.

Where it seems to come from is SPSS users' habit of setting value labels on
the various categories of user-missing values, so a survey will commonly have
no actual missings (in SPSS, system-missing) but -9 as 'refused', -8 as
'not contactable' and so on. The importer picks up the existence of value
labels and sets the measurement to "nominal", which becomes a factor in R
usage; read.spss (foreign package) would likewise be likely to read these in
as factors. For analysis purposes these values would usually be set to NA,
but it may be important to record that you made that change.

On 26 July 2012 01:35, David Winsemius <dwinsem...@comcast.net> wrote:
>
> On Jul 25, 2012, at 9:48 AM, Marion Wenty wrote:
>
>> Dear list members,
>>
>> I have got another problem. I imported an SPSS file with the Memisc
>> package
>> using the following commands:
>>
>> mz <- spss.system.file("myspssfile.sav")
>>
>> mz <- subset(mz,select=c(
>> bsex,balt,xurb,dtaet,kartab,bgeb,boseit,bgeblan,xnuts2,kausb,xerwstat,
>> asbper,asbhh,ajahr,aquartal,bstaat,xwieoft,gew1,apkz,bpkzm,bpkzv))
>
> The memisc package help file for spss.system.file() (actually labeled
> "importers") says that there is an S4 method for "subset" there does not
> seem to be a separate page describing its behavior or values
>
>> Afterwards I checked the measurements of the variables
>
> What does that phrase mean? What code did you actually use?
>
>> and they are all
>> right for most of them (e.g. the variable containing the sex of a person
>> is
>> "nominal" and the variable containing the year is "interval"). For two of
>> the variables the measurement is not o.k, though. They exclusively contain
>> numbers in the SPSS file (e.g. the age of a person) - and no NAs - but
>> have
>> got the measurement "nominal"!
>
> Hard to say. There is no R storage mode that is called "nominal". Perhaps
> some sort of memisc-specific terminology? Or perhaps something in your spss
> dataset...
> to which you have not provided access? Or the default setting for
> the spss.system.file access method? I know that the Hmisc package's
> describe() function will report out a variable as though it were categorical
> if there are only 8 or fewer unique values.
>
> After looking at further help pages and trying the help pages code, I am
> guessing that some of my puzzlement might be answered by reading the
> vignettes, but you can do that yourself.
>
> Here's a hackish guess ... try:
>
> ?measurement
> measurement(mz$age_variable) <- "interval"
> # where age_variable is the unstated item in that "select" list
>
> --
>
> David Winsemius, MD
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
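
P.S. As a rough illustration of the point about labelled user-missing codes
at the top of this reply, here is a minimal base R sketch. The data are made
up, and the names age_raw, age_factor, is_user_missing, missing_reason and
age are hypothetical. It only mimics what an importer might hand back when
value labels turn a numeric variable into a factor; it is not memisc's own
mechanism.

```r
# Made-up ages as they might sit in an SPSS file:
# -9 = "refused", -8 = "not contactable" (the codes mentioned above).
age_raw <- c(34, 51, -9, 27, -8, 45)

# Assumption: an importer that honours value labels returns a factor
# in which the labelled codes appear as named levels.
age_factor <- factor(age_raw,
                     levels = c(-9, -8, 27, 34, 45, 51),
                     labels = c("refused", "not contactable",
                                "27", "34", "45", "51"))

# Flag the user-missing categories before discarding them.
is_user_missing <- age_factor %in% c("refused", "not contactable")

# Keep a record of why each value was set to NA.
missing_reason <- ifelse(is_user_missing,
                         as.character(age_factor),
                         NA_character_)

# Recover the numeric ages; user-missing entries stay NA.
age <- rep(NA_real_, length(age_factor))
age[!is_user_missing] <-
  as.numeric(as.character(age_factor[!is_user_missing]))

# The reasons remain available for reporting, and the ages are
# usable as an interval-level variable again.
table(missing_reason, useNA = "ifany")
mean(age, na.rm = TRUE)
```

Keeping missing_reason alongside age is one way to document the recoding, so
that 'refused' and 'not contactable' can still be counted separately later.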