Re: [R] Creating a new column from a series of columns

Jeff Newmiller Fri, 31 Oct 2014 20:16:26 -0700

This method also handles rows where more than one column is "Yes": such a 
row appears once per "Yes" column in the result.


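library(reshape2)
# Reshape to long format: one row per PLTID/column pair
ddl <- melt( dd, id.vars = "PLTID" )
# Replace NA with "" so the comparison below never yields NA
ddl[ is.na( ddl$value ), "value" ] <- ""
# Keep only the "Yes" entries
ddl <- ddl[ "Yes" == ddl$value, ]
# Left join onto every PLTID; a PLTID with no "Yes" gets NA
result <- merge( dd[ , "PLTID", drop=FALSE ]
               , ddl[ , c( "PLTID", "variable", "value" ) ]
               , all.x=TRUE
               )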
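
For readers without reshape2, a rough base R sketch of the same idea follows. It is not the code above: it keeps one row per subject and joins the names of all "Yes" columns into a single string. The four-row data frame and the "; " separator are assumptions made up for illustration.

```r
# Small stand-in for the poster's data; values are illustrative only.
dd <- data.frame(
  PLTID = c(1, 2, 3, 4),
  Asian..RACE2. = c("", "Yes", "", ""),
  Black..RACE3. = c("Yes", "", "", "Yes"),
  White..RACE5. = c("", "", "", "Yes"),
  Other.Race..RACE6. = c(NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

race_cols <- setdiff(names(dd), "PLTID")

# Logical matrix: TRUE where a cell is exactly "Yes"; NA and "" become FALSE.
is_yes <- !is.na(dd[race_cols]) & dd[race_cols] == "Yes"

# For each row, join the names of all "Yes" columns; NA when there are none.
dd$RACE <- unname(apply(is_yes, 1, function(r) {
  if (any(r)) paste(race_cols[r], collapse = "; ") else NA_character_
}))
```

Row 4 here has two "Yes" entries and ends up with both column names in RACE, while row 3 has none and gets NA.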

On Fri, 31 Oct 2014, Fisher Dennis wrote:

R 3.1.1
OS X

Colleagues,
I have a dataset containing multiple columns indicating race for subjects in a 
clinical trial.  A subset of the data (obtained with dput) is shown here:

structure(list(PLTID = c(7157, 8138, 8150, 9112, 9114, 9115,
9124, 9133, 9141, 9144, 9148, 12110, 12111, 12116, 12134, 12136,
12137, 12142, 12143, 12146, 12147, 13159), Indian..RACE1. = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), Asian..RACE2. = c("", "Yes", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
""), Black..RACE3. = c("Yes", "", "", "Yes", "Yes", "Yes", "Yes",
"Yes", "", "Yes", "", "", "", "", "", "", "", "Yes", "Yes", "",
"", ""), Native.Hawaiian.or.other.Pacif..RACE4. = c(NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), White..RACE5. = c("", "", "Yes", "", "", "", "",
"", "Yes", "", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"", "", "Yes", "Yes", "Yes"), Other.Race..RACE6. = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA), Specify.Other.Race..RACEOTH. = c(NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA)), .Names = c("PLTID", "Indian..RACE1.", "Asian..RACE2.",
"Black..RACE3.", "Native.Hawaiian.or.other.Pacif..RACE4.", "White..RACE5.",
"Other.Race..RACE6.", "Specify.Other.Race..RACEOTH."), class = "data.frame", 
row.names = 43:64)

I would like to add a column that indicates which of the other columns contains 
"Yes".  In other words, that column would contain:
        Black..RACE3.
        Asian..RACE2.
        White..RACE5.
        Black..RACE3.
        ...

Even better would be
        Black
        Asian
        White
        Black
        ...
(which I can accomplish with strsplit)
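
A short sketch of that label cleanup, assuming every column name follows the "Label..RACEn." pattern seen in the dput output. The vector col_names is a hypothetical input; sub() is shown alongside the strsplit route mentioned above.

```r
# Hypothetical input: column names as they would appear in the new column.
col_names <- c("Black..RACE3.", "Asian..RACE2.", "White..RACE5.",
               "Native.Hawaiian.or.other.Pacif..RACE4.", NA)

# Drop everything from the first ".." onward; NA stays NA.
via_sub <- sub("\\.\\..*$", "", col_names)

# Same result with strsplit: split on the literal ".." and keep the first piece.
via_strsplit <- vapply(strsplit(col_names, "..", fixed = TRUE),
                       function(parts) parts[1], character(1))
```

Both routes leave the single dots inside a label untouched, so "Native.Hawaiian.or.other.Pacif" would still need its own tidying.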

None of the rows contains more than one "Yes" although it is possible that none 
of the entries in a row would be "Yes" (in which case, the entry in the new 
column should be NA)

I could do this by looping through each of the columns with something like this:
        DATA$RACE               <- NA
        for (COL in 2:8)        DATA$RACE[which(DATA[,COL] == "Yes")] <- names(DATA)[COL]
But, I suspect that there is some more elegant way to accomplish this.
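
One loop-free possibility for the at-most-one-"Yes" case described above is sketched here with max.col(). It is an illustration only, not something proposed in this thread, and the small DATA frame is invented for the example.

```r
# Illustrative data: at most one "Yes" per row, as described in the post.
DATA <- data.frame(
  PLTID = c(1, 2, 3, 4),
  Indian..RACE1. = c(NA, NA, NA, NA),
  Asian..RACE2. = c("", "Yes", "", ""),
  Black..RACE3. = c("Yes", "", "", ""),
  White..RACE5. = c("", "", "", "Yes"),
  stringsAsFactors = FALSE
)

race_cols <- names(DATA)[-1]

# Logical matrix: TRUE only where a cell is exactly "Yes".
is_yes <- !is.na(DATA[race_cols]) & DATA[race_cols] == "Yes"

# Position of the first "Yes" in each row (1 when the row has none).
first_yes <- max.col(is_yes + 0, ties.method = "first")

# Rows with no "Yes" are set to NA instead of the arbitrary position 1.
has_yes <- unname(rowSums(is_yes) > 0)
DATA$RACE <- ifelse(has_yes, race_cols[first_yes], NA_character_)
```

If a row did contain several "Yes" entries, this sketch would silently keep only the first, so it suits only data where that cannot happen.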

Thanks in advance.

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
