This is faster than your version, and doesn't return NA:

df$htn <- apply(df[,2:4], 1, function(x) any(grepl("^410", x)))
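If you would rather skip regex entirely, as the original post suggests, a column-wise prefix test is another option. What follows is only a sketch under a few assumptions: it needs R 3.3.0 or later for startsWith() (on older versions, substr(x, 1, 3) == "410" should do the same job), and the names dx_cols and hit are made up for illustration. It stores the result as 1/0, as the original post requested, rather than TRUE/FALSE.

```r
# Rebuild the example data from the original post (all columns as character).
df <- read.table(text = "
ID DX1 DX2 DX3
1 4109 4280 7102
2 734 311 490
3 4011 42822 4101
", header = TRUE, strip.white = TRUE, colClasses = "character")

# Hypothetical name: the diagnosis columns to search.
dx_cols <- c("DX1", "DX2", "DX3")

# startsWith() is a plain prefix test (no regex); it needs R >= 3.3.0.
# lapply() yields one logical vector per column; Reduce() ORs them together,
# giving one TRUE/FALSE per row.
hit <- Reduce(`|`, lapply(df[dx_cols], startsWith, "410"))

# Store as 1/0, matching the output requested in the original post.
df$htn <- as.integer(hit)
print(df)
```

Because this works on whole columns rather than one row at a time, it avoids the per-row function calls that apply() makes. I have not timed it against the versions in this thread.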
> df
  ID  DX1   DX2  DX3   htn
1  1 4109  4280 7102  TRUE
2  2  734   311  490 FALSE
3  3 4011 42822 4101  TRUE

> system.time({
+   for(j in 1:10000) {
+     for (i in 1:nrow(df)) {
+       df[i,"htn"] <- any(sapply('410', function(x) which( grepl(x, df[i, 2:4], fixed = TRUE) )))
+     }
+   }
+ })
   user  system elapsed
  6.648   0.008   6.657
There were 50 or more warnings (use warnings() to see the first 50)

> system.time({
+   for(j in 1:10000) {
+     df$htn <- apply(df[,2:4], 1, function(x)any(grepl("^410", x)))
+   }
+ })
   user  system elapsed
  1.826   0.000   1.826

On Mon, Jun 15, 2015 at 4:12 PM, Federman, Douglas
<douglas.feder...@utoledo.edu> wrote:
> I'm trying to do the following: search each patient's list of diagnoses
> for a specific code, then create a new column based upon the presence of
> the specific code.
> Simplified data follows:
>
> con <- textConnection("
> ID DX1 DX2 DX3
> 1 4109 4280 7102
> 2 734 311 490
> 3 4011 42822 4101
> ")
> df <- read.table(con, header = TRUE, strip.white = TRUE,
>                  colClasses="character")
> #
> # I would like to add a column such that the result of searching for 410
> # would give the following. The search string would always be at the start
> # of a word and doesn't need regex.
> #
> # ID DX1  DX2   DX3  htn
> # 1  4109 4280  7102 1
> # 2  734  311   490  0
> # 3  4011 42822 4101 1
> #
> # The following works but is slow and returns NA if the search string is
> # not found:
>
> for (i in 1:nrow(df)) {
>     df[i,"htn"] <- any(sapply('410', function(x) which( grepl(x, df[i, 2:4],
>                        fixed = TRUE) )))
> }
>
> Thanks in advance. I never fail to learn new things from this list.
>

--
Sarah Goslee
http://www.functionaldiversity.org