Hello, I have a large data frame (1,000,000 rows, 1,000 columns) in which every cell holds a single character. I would like to determine the most common character in each row. In the example below, I process one row at a time and find its most common character (ties are not handled: the first maximum wins). I expect this to be very slow and memory-hungry at full size. Is there a more efficient way to do it? Thank you
```
V = c("A", "B", "C", "D")
df = data.frame(n = 1:10,
                col_01 = sample(V, 10, replace = TRUE, prob = NULL),
                col_02 = sample(V, 10, replace = TRUE, prob = NULL),
                col_03 = sample(V, 10, replace = TRUE, prob = NULL),
                col_04 = sample(V, 10, replace = TRUE, prob = NULL),
                col_05 = sample(V, 10, replace = TRUE, prob = NULL),
                stringsAsFactors = FALSE)

q = vector()
for(i in 1:nrow(df)) {
  x = as.vector(t(df[i,2:ncol(df)]))
  q[i] = names(which.max(table(x)))
}
df$most = q
```
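One direction that might avoid the row-by-row loop, given here only as a sketch: count how often each candidate value occurs in every row with `rowSums()`, then pick the value with the largest count using `max.col()`. It assumes the set of possible values `V` is known in advance and that there are no `NA` cells; `set.seed()` and the names `m` and `counts` are added here for illustration. Ties go to the first value in `V`, which should match `which.max(table(x))` when `V` is sorted.

```r
set.seed(1)  # assumption: only here to make the example reproducible

V <- c("A", "B", "C", "D")
df <- data.frame(n = 1:10,
                 col_01 = sample(V, 10, replace = TRUE),
                 col_02 = sample(V, 10, replace = TRUE),
                 col_03 = sample(V, 10, replace = TRUE),
                 col_04 = sample(V, 10, replace = TRUE),
                 col_05 = sample(V, 10, replace = TRUE),
                 stringsAsFactors = FALSE)

# Character matrix of the value columns (drops the id column n).
m <- as.matrix(df[, -1])

# One column of counts per candidate value: counts[i, "A"] is the
# number of "A" cells in row i. This loops over length(V) values,
# not over the rows.
counts <- sapply(V, function(v) rowSums(m == v))

# Index of the largest count in each row; ties resolve to the first
# value in V.
df$most <- V[max.col(counts, ties.method = "first")]
```

At the full size, `m` and each `m == v` comparison are themselves large objects, so it may be necessary to apply the same steps to blocks of rows rather than to the whole data frame at once.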