And if your data is in a data frame (please include an example of the output of str() next time):
> dfrm <- rd.txt("Column1, Column2, Column3
+ Yes,Yes,Yes
+ Yes,No,Yes
+ No,No,No
+ No,Yes,No
+ Yes,Yes,No", sep=",") #rd.txt is just a wrapper I use for read.table(textConnection( ), header=TRUE, ... )
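
For anyone who wants to run the example, here is a minimal sketch of what a wrapper like rd.txt might look like. It is reconstructed only from the comment above, so the argument names and defaults are assumptions, not the actual definition used in this post.

```r
# Hypothetical reconstruction of the rd.txt helper described in the comment:
# read.table() applied to a text connection, with header = TRUE by default.
rd.txt <- function(txt, header = TRUE, ...) {
  con <- textConnection(txt)
  on.exit(close(con))              # always release the connection
  read.table(con, header = header, ...)
}

# Small usage example with made-up data
demo <- rd.txt("Column1,Column2
Yes,No
No,No", sep = ",")
```

In current versions of R, read.table(text = "...", header = TRUE, sep = ",") does the same job without a helper.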

> dfrm$newvar <- apply(subset(dfrm, select=c(Column1, Column2, Column3)), 1,
+                         function(x) { if (all(x=="Yes")) {"Yes"} else {"No"} } )
> dfrm
  Column1 Column2 Column3 newvar
1     Yes     Yes     Yes    Yes
2     Yes      No     Yes     No
3      No      No      No     No
4      No     Yes      No     No
5     Yes     Yes      No     No

Notice that I created this variable in a manner that did not require the use of every column of the dataframe.

--
David


On Feb 26, 2010, at 7:57 PM, Don MacQueen wrote:

If your data is in a matrix named "orgdata" :

newvar <- apply(orgdata , 1, function(arow, if (all(arow=='Yes')) 'Yes' else 'No'

Yes, at least 2 missing parens and an unneeded comma, perhaps:

newvar <- apply(orgdata , 1, function(arow) if (all(arow=='Yes')) 'Yes' else 'No' )


newdata <- cbind(orgdata, newvar)

finaloutcome <- newdata[ newvar=='Yes',]


The key to this is the apply() function.
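
The three steps above can be put together as a minimal, self-contained sketch. The matrix here is built by hand from the data in the original question, and drop = FALSE is an addition so that a single surviving row stays a matrix instead of collapsing to a vector.

```r
# Sample matrix typed in from the question (an assumption about how
# orgdata is actually stored)
orgdata <- matrix(c("Yes", "Yes", "Yes",
                    "Yes", "No",  "Yes",
                    "No",  "No",  "No",
                    "No",  "Yes", "No",
                    "Yes", "Yes", "No"),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("Column1", "Column2", "Column3")))

# apply() with MARGIN = 1 calls the function once per row
newvar <- apply(orgdata, 1,
                function(arow) if (all(arow == "Yes")) "Yes" else "No")

newdata <- cbind(orgdata, newvar)

# Keep only the rows flagged "Yes"; drop = FALSE preserves the matrix shape
finaloutcome <- newdata[newvar == "Yes", , drop = FALSE]
```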

I might have missed some parentheses...

There are other ways; this is just one. I might think of a simpler one if I gave it more time...
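
One possibly simpler alternative, offered only as a sketch and not as the approach anyone in this thread had in mind: compare the whole matrix to "Yes" at once and count the matches per row with rowSums(). The sample matrix is again typed in from the question.

```r
# Sample matrix typed in from the question (hypothetical stand-in for orgdata)
orgdata <- matrix(c("Yes", "Yes", "Yes",
                    "Yes", "No",  "Yes",
                    "No",  "No",  "No",
                    "No",  "Yes", "No",
                    "Yes", "Yes", "No"),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("Column1", "Column2", "Column3")))

# orgdata == "Yes" is a logical matrix; a row is all "Yes" when the
# number of TRUE values equals the number of columns
all_yes <- rowSums(orgdata == "Yes") == ncol(orgdata)

newvar <- ifelse(all_yes, "Yes", "No")
finaloutcome <- orgdata[all_yes, , drop = FALSE]
```

This avoids calling a function once per row, which may matter on large inputs.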

-Don

At 4:40 PM -0800 2/26/10, wookie1976 wrote:
I am new to R, but have been using SAS for years. In this transition period, I am finding myself pulling my hair out to do some of the simplest things. An example of this is that I need to generate a new variable based on the outcome of several existing variables in a data row. In other words, if the variables in all three existing columns are "Yes", then the new variable should also be "Yes"; however, if any one of the three existing variables is a "No", then the new variable should be a "No". I would then use that new variable as an exclusion for data in a new or existing dataset (i.e., if NewVariable = "No" then delete):

Take this:
Column1, Column2, Column3
Yes, Yes, Yes
Yes, No, Yes
No, No, No
No, Yes, No
Yes, Yes, No

Generate this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes
Yes, No, Yes, No
No, No, No, No
No, Yes, No, No
Yes, Yes, No, No

And end up with this:
Column1, Column2, Column3, NewVariable1
Yes, Yes, Yes, Yes

Any suggestions on how to efficiently do this in either the existing or a
new dataset?


You might have simplified this a bit if you let the columns be logical rather than character.

> dfrm$newvar <- apply(subset(dfrm, select=c(Column1, Column2, Column3)), 1,
+                         function(x) {  (all(x=="Yes"))  } )
> dfrm
  Column1 Column2 Column3 newvar
1     Yes     Yes     Yes   TRUE
2     Yes      No     Yes  FALSE
3      No      No      No  FALSE
4      No     Yes      No  FALSE
5     Yes     Yes      No  FALSE

You would then be able to apply simpler tests with operators and functions that accept the logical data type.
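
As a rough illustration of what a logical column buys you, here is a minimal sketch. The data frame is rebuilt by hand from the question, and the names kept, dropped and n_kept are made up for this example.

```r
# Data frame typed in from the question; stringsAsFactors = FALSE keeps
# the columns as plain character vectors
dfrm <- data.frame(Column1 = c("Yes", "Yes", "No", "No", "Yes"),
                   Column2 = c("Yes", "No", "No", "Yes", "Yes"),
                   Column3 = c("Yes", "Yes", "No", "No", "No"),
                   stringsAsFactors = FALSE)

dfrm$newvar <- apply(dfrm[, c("Column1", "Column2", "Column3")], 1,
                     function(x) all(x == "Yes"))

# A logical vector can index rows directly, with no comparison to "Yes"
kept    <- dfrm[dfrm$newvar, ]

# Negation gives the excluded rows
dropped <- dfrm[!dfrm$newvar, ]

# TRUE counts as 1, so sum() gives the number of rows kept
n_kept  <- sum(dfrm$newvar)
```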

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

