Greetings all.

I'm starting analysis in R on a reasonably sized pre-existing dataset of 583
variables and 1127 observations. It was an SPSS data file, which I read in with
read.spss() from the foreign package. I used the read.spss() defaults, except
that I set to.data.frame = TRUE so the data came in as a data.frame.

The data is a survey dataset (each observation/case = one participant), and 
many of the variables are participants' responses to Likert scale items. These 
have been coded on a 1 to 7 scale, with "8" used to code "Don't know" 
responses. The assumption is that the 1-7 responses are at least interval
level, whereas the response "8" clearly is not. For many analyses this doesn't
matter because I'm only doing chi-square tests. However, for a between-group
comparison crosstab I would like to exclude those who gave "8" responses,
because I am only interested in testing differences for the participants who
gave responses measured on the Likert scale proper.

I have run into problems when trying to exclude from the analysis those
observations with an "8" response to either of two questions (Question 1A and
Question 1B), which are columns 72 and 73 of the data frame. The chi-square
test I am trying to do is based on two other variables, the mean of Q1A and Q1B
for each participant and a grouping variable, which are in columns 8 and 80 of
the data frame, respectively. I am excluding anyone who gave an "8" ("Don't
know") response on Question 1A or 1B because their mean on these two questions
cannot be interpreted: the value "8" is nominal rather than interval/ratio and
therefore cannot be used in a mathematical expression.

I've been trying to use an if-or combination, and I can't get it to work. The
chi-square test without the attempt to subset using "if" works fine, but I
don't understand what I am doing wrong in my attempts to subset.

I have tried to reference the variables like this:
> if ("Q1A"!=8 | "Q1B"!=8)
+ (table(micronutrients[,8,80]))
<group counts snipped>
> chisq.test(table(micronutrients[,8,80]))

The group counts returned from the table statement show me that no observations 
are being excluded from the analysis. The chisq.test works fine on 
(table(micronutrients[,8,80])) but, of course, it is being performed on the 
entire dataset as I have been unsuccessful in subsetting the data.

I tried to see if the column names were objects and I got these errors:
> object("Q1A")
Error: could not find function "object"
> Q1A
Error: object 'Q1A' not found
I'm not sure if this is important.
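
On the two errors above: as far as I understand it, the columns of a data frame
are not free-standing objects in the workspace, so typing Q1A on its own finds
nothing unless the data frame has been attached. A minimal sketch of how a
column can be referenced, using a small made-up data frame called toy (a
hypothetical stand-in; the real data is micronutrients):

```r
# Toy stand-in for the real data; the values are invented for illustration.
toy <- data.frame(Q1A = c(1, 8, 3, 7), Q1B = c(2, 4, 8, 6))

exists("Q1A")          # is there a workspace object called Q1A?
"Q1A" %in% names(toy)  # is there a column called Q1A in the data frame?

toy$Q1A                # the column, referenced by name
toy[["Q1A"]]           # the same column, with the name given as a string
toy[, 1]               # the same column, referenced by position

# A quoted name is just text: this compares the string "Q1A" with 8,
# not the contents of the column.
"Q1A" != 8
```

This would also explain why the first attempt excluded nobody: the quoted
"Q1A" in the condition is a piece of text that never equals 8, so the condition
is true regardless of the data.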

So I tried to do the if-or using the column number, but that didn't work either:
> if (micronutrients[,72]!=8 | micronutrients[,73]!=8)
+ (table(micronutrients[,8,80]))
<group counts snipped>
Warning message:
In if (micronutrients[, 72] != 8 | micronutrients[, 73] != 8) 
(table(micronutrients[,  :
  the condition has length > 1 and only the first element will be used

I got exactly the same chi-square output as in my previous attempt.
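
The warning suggests that "if" wants a single TRUE or FALSE, whereas comparing
a whole column with 8 gives one value per participant, so only the first
participant's answer decides whether table() is run at all. A sketch on
invented data (toy and its columns are hypothetical stand-ins) of the per-row
logical vector such a comparison produces, and how it can select rows:

```r
# Invented data: 8 stands for "Don't know".
toy <- data.frame(Q1A = c(1, 8, 3, 7, 8),
                  Q1B = c(2, 4, 8, 6, 8))

# One TRUE/FALSE per row: TRUE where neither answer is 8.
keep <- toy$Q1A != 8 & toy$Q1B != 8
keep

# A logical vector in the row position keeps only the TRUE rows.
toy[keep, ]

# Number of rows that would be dropped.
sum(!keep)
```

Note the use of & rather than |: with |, a participant with a single 8 is still
kept, because one of the two comparisons is TRUE. Also, read.spss() defaults to
use.value.labels = TRUE, so Q1A and Q1B may have been imported as factors, in
which case the comparison would presumably need the label text rather than 8.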

If any of you know SPSS, what I am trying to do in R is equivalent to:
temporary. select if not (Q1A=8 or Q1B=8). In SAS, it would be the same as a
subsetting if that lasted only for the particular analysis, or a where, e.g.
proc tabulate; where Q1A ne 8 and Q1B ne 8;

How can I subset the data? I would prefer not to create another variable to 
hold the recodes as the dataset is already complex.

I only wish the subsetting condition to hold for the test immediately following 
the instruction to subset (I need to subset the data in different ways for 
different question combinations). Because the instruction is complete once the 
table() command is issued, I am assuming that the if statement only relates to 
the table() command and therefore only indirectly to the chisq.test() command 
following (as this is being performed on the subsetted table) - which is 
exactly what I want.
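
A minimal sketch of a subset that lasts only for a single test, assuming the
selection is applied to the rows of the data frame inside the table() call
rather than through "if". The data frame toy and the column names Q1mean and
group are invented stand-ins for columns 8 and 80, and the banding of the mean
is purely illustrative:

```r
# Invented stand-in for the survey data; 8 codes "Don't know".
set.seed(1)
toy <- data.frame(
  Q1A   = sample(1:8, 200, replace = TRUE),
  Q1B   = sample(1:8, 200, replace = TRUE),
  group = sample(c("A", "B"), 200, replace = TRUE)
)
# Hypothetical banded version of the Q1A/Q1B mean, standing in for column 8.
toy$Q1mean <- cut((toy$Q1A + toy$Q1B) / 2, breaks = c(0, 3, 5, 8))

# subset() filters rows and picks columns for this call only;
# toy itself is left unchanged.
tab <- table(subset(toy, Q1A != 8 & Q1B != 8, select = c(Q1mean, group)))
tab
chisq.test(tab)
```

Because the filtered copy exists only inside the call, a different condition
can be used for the next test. subset() treats a missing value in the condition
as FALSE, so rows with NA on either question are dropped as well. With column
positions instead of names, the two positions would go inside c(), as in
x[rows, c(8, 80)]; written as [, 8, 80], the 80 appears to be taken as the drop
argument rather than as a second column.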

Cheers
Michelle