> From a dataframe there are 27 variables of interest, with the
> prefix of "pre".
>
>  [7] "Decision" "MHCDate"  "pre01"    "pre01111" "pre012"   "pre013"
> [13] "pre02"    "pre02111" "pre02114" "pre0211"  "pre0212"  "pre029"
> [19] "pre03a"   "pre0311"  "pre0312"  "pre03"    "pre04"    "pre05"
> [25] "pre06"    "pre07"    "pre08"    "pre09"    "pre10"    "pre11"
> [31] "pre12"    "pre13"    "pre14"    "pre15"    "pre16"
>
> I want to combine these variables into new variables, using the
> following criteria:
>
> (1) create a single variable PRE, when any of the 27 'pre' variables
> have a value >= '1'
> (2) create a variable HOM, when any of the pre01, pre01111, pre012,
> pre013 variables have a value >= '1'
> (3) create a variable ASS, when any of the pre02, pre02111, pre02114,
> pre0211, pre0212, pre029 variables have a value >= '1'
> (4) create a variable SEX, when any of the pre03a, pre0311, pre0312,
> pre03 variables have a value >= '1'
> (5) create a variable VIO, when any of the pre01 to pre06 variables
> have a value >= '1'
> (6) create a variable SERASS. If pre02111 or pre2114 >= '1', assign a
> value of 1, if there is a value of 1 or greater for pre0211 assign a
> value of 2; & if there is a value of
> 1 or greater for pre0212: assign a value of 3; if there is a value
> of 1 or greater for pre2029 assign a value of 4; everything else = 0.
> If a case has multiple values, 02111 prevails over 2114, 2114
> prevails over 0211, 0211 prevails over 0212; 0212 prevails over 2029.
>
> I believe I can generate new variables (1) - (5) using code such
> as: ASS <- (reoffend$pre02 | reoffend$pre02111 | reoffend$pre02114 |
> reoffend$pre0211 | reoffend$pre0212 | reoffend$pre029 >= '1')
>
> I have three questions:
>
> 1. If this is correct, what is the most efficient way to generate (1)
> without having to type all the variable names. The following does not
> work: PRE <- reoffend [,9:35], >= '1'

Try something like this (data frame simplified):

    df <- data.frame(pre1 = c(0, 1, 1, 2), pre2 = c(0, 0, 1, 0), foo = c(0, 0, 1, 3))
    # positions of the columns whose names start with "pre"
    precols <- grep("^pre", names(df))
    gt1 <- function(x) x >= 1
    # inner apply: test every "pre" column; outer apply: any TRUE within each row
    PRE <- apply(apply(df[, precols, drop = FALSE], 2, gt1), 1, any)

> 2. I am unsure as to how to generate Example 6.

Assign the lowest-priority code first, so that each later assignment
overwrites it wherever a higher-priority condition also holds:

    SERASS <- rep(0, nrow(df))
    SERASS[df$pre029 >= 1] <- 4
    SERASS[df$pre0212 >= 1] <- 3
    SERASS[df$pre0211 >= 1] <- 2
    SERASS[df$pre02111 >= 1 | df$pre02114 >= 1] <- 1

(Here df stands for your full data frame. I have used pre029 and
pre02114 as they appear in your column listing; your description says
pre2029 and pre2114, which are not among the names shown.)

> 3. I wanted to exclude cases with a reoffend$Decision of value of 3,
> using the code below. However, I received a message saying there were
> NAs produced, however, the raw variable did not have NAs.
>
> > MHT.decision <- reoffend[reoffend$Decision >= '2',]
> > table(MHT.decision)
> Error in vector("integer", length) : vector size cannot be NA
> In addition: Warning messages:
> 1: NAs produced by integer overflow in: pd * (as.integer(cat) - 1L)
> 2: NAs produced by integer overflow in: pd * nl
>
> > table(reoffend$Decision)
>    1    2    3
> 1136  445   66

I doubt that you want quotes around the '2' when defining MHT.decision.

Regards,
Richie.

Mathematical Sciences Unit
HSL
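
A further sketch for questions 1 and 3, offered as one possible approach rather than the only one. It uses a made-up four-row data frame with a handful of the column names from the listing. The helper name any_ge1 is hypothetical (it is not a base R function), and the subsetting step assumes the goal is simply to drop rows where Decision equals 3. Note also that calling table() on a whole data frame cross-classifies every column at once, which is the likely source of the integer overflow; tabulating a single column avoids it.

```r
# Toy stand-in for the real data (values invented for illustration).
reoffend <- data.frame(
  Decision = c(1, 2, 3, 1),
  pre01    = c(0, 1, 0, 0),
  pre012   = c(0, 0, 0, 2),
  pre02    = c(0, 0, 3, 0),
  pre0211  = c(0, 0, 1, 0)
)

# Hypothetical helper: TRUE for each row where at least one of the
# named columns is >= 1. Missing values are treated as "not >= 1".
any_ge1 <- function(data, cols) {
  rowSums(data[, cols, drop = FALSE] >= 1, na.rm = TRUE) > 0
}

# (1) every column whose name starts with "pre"
precols <- grep("^pre", names(reoffend), value = TRUE)
PRE <- any_ge1(reoffend, precols)

# (2), (3): the same helper with an explicit list of columns
HOM <- any_ge1(reoffend, c("pre01", "pre012"))
ASS <- any_ge1(reoffend, c("pre02", "pre0211"))

# Question 3: compare against the number 3 (no quotes), then
# tabulate one column rather than the whole data frame.
MHT.decision <- reoffend[reoffend$Decision != 3, ]
decision_counts <- table(MHT.decision$Decision)
```

The same helper would cover SEX and VIO by passing the relevant column names, so no long chain of `|` is needed.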