Hi, I am using R's grep function to find patterns in vectors of strings. The number of patterns I would like to match is 7,700 (of different sizes). I noticed that I get an error message when I do the following:

    data <- integer(length(x))
    for (j in seq_along(x)) {
      # 1 if any of the 7,700 patterns matches x[j], otherwise 0
      data[j] <- length(grep(paste(patterns[1:7700], collapse = "|"), x[j], value = TRUE))
    }

When I break the patterns up into 4 chunks, it works:

    data <- list()
    for (j in seq_along(x)) {
      # same test as above, one result vector per chunk of patterns
      data$chunk1[j] <- length(grep(paste(patterns[1:2500], collapse = "|"), x[j], value = TRUE))
      data$chunk2[j] <- length(grep(paste(patterns[2501:5000], collapse = "|"), x[j], value = TRUE))
      data$chunk3[j] <- length(grep(paste(patterns[5001:7500], collapse = "|"), x[j], value = TRUE))
      data$chunk4[j] <- length(grep(paste(patterns[7501:7700], collapse = "|"), x[j], value = TRUE))
    }

My questions:

What is the maximum size of the pattern argument in grep?
Is there a way to do this faster? It is very slow.

Thanks.
Math

Sorry for not providing a reproducible example. It is a size issue, which makes it difficult to provide one.

--
View this message in context: http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613.html
Sent from the R help mailing list archive at Nabble.com.
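
P.S. The real data are too large to post, but a scaled-down toy version may still show the shape of the problem. Everything below is made up for illustration: the four patterns, the four strings, the chunk size of 2, and the names loop_counts, vec_counts and chunk_counts are placeholders, not my real data. The sketch compares the loop from above with a single vectorised grepl() call and with a chunked variant that flags a string if any chunk matches. I have not checked whether either variant avoids the error at the full 7,700 patterns.

```r
# Toy stand-ins for the real data (made up purely for illustration).
patterns <- c("cat", "dog", "bird", "fish")
x <- c("the cat sat", "no match here", "dog and bird", "goldfish")

# Loop version, as in the post: 1 if any pattern matches x[j], else 0.
combined <- paste(patterns, collapse = "|")
loop_counts <- integer(length(x))
for (j in seq_along(x)) {
  loop_counts[j] <- length(grep(combined, x[j], value = TRUE))
}

# Vectorised version: grepl() takes the whole vector x in one call,
# so the regular expression is only passed once.
vec_counts <- as.integer(grepl(combined, x))

# Chunked version: split the patterns into groups (size 2 is arbitrary),
# test each group separately, and combine the results with a logical OR.
chunks <- split(patterns, ceiling(seq_along(patterns) / 2))
chunk_hits <- lapply(chunks, function(p) grepl(paste(p, collapse = "|"), x))
chunk_counts <- as.integer(Reduce(`|`, chunk_hits))

print(loop_counts)
print(vec_counts)
print(chunk_counts)
```

On this toy input the three vectors agree. Note that the chunked variant here merges the chunks into one result per string, whereas my loop above keeps a separate result for each chunk.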