Hello,

Combine quantile() with findInterval(). Something like the following.
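
# sample data
x <- rnorm(100)

# labels in ascending order of x, one per interval of qq
val <- c("Bottom 50", "20 to 50", "5 to 20", "Top 5%")
# break points at the 50th, 80th and 95th percentiles
qq <- quantile(x, probs = c(0, 0.50, 0.80, 0.95, 1))
# rightmost.closed = TRUE keeps max(x) in the last interval (index 4, not 5)
idx <- findInterval(x, qq, rightmost.closed = TRUE)
val[idx]

Hope this helps,

Rui Barradas

On 31-07-2013 10:37, Dark wrote: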
Hi all,

I think this should be an easy question for the gurus out here.

I have this large data frame (2.500.000 rows, 15 columns) and I want to add a column named "SEGMENT" to it.

The first 5% of the rows (first 125.000 rows) should have the value "Top 5%" in the SEGMENT column.
Then the rows from 5% to 20% should have the value "5 to 20".
Then 20-50% should have the value "20 to 50".
And the last 50% of the rows should have the value "Bottom 50".

What is the easiest way of doing this? I was thinking of using quantile, but then I would need some row number column.

Regards,
Derk

--
View this message in context: http://r.789695.n4.nabble.com/Add-a-column-to-a-data-frame-with-value-based-on-the-percentile-of-the-row-tp4672711.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
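
If the segments are meant to follow row position rather than the values of a column, as the question's "first 125.000 rows" suggests, something along these lines might work. This is only a sketch: the data frame df and its columns id and score are made up for illustration, and it assumes the rows are already sorted so that the "top" rows come first.

```r
# Hypothetical small data frame standing in for the 2.500.000-row one
n  <- 1000
df <- data.frame(id = seq_len(n), score = rnorm(n))

# Relative position of each row in the table, in (0, 1]
pos <- seq_len(nrow(df)) / nrow(df)

# Break at 5%, 20% and 50% of the rows; cut() closes intervals on the right,
# so the row sitting exactly at 5% still counts as "Top 5%"
df$SEGMENT <- cut(pos,
                  breaks = c(0, 0.05, 0.20, 0.50, 1),
                  labels = c("Top 5%", "5 to 20", "20 to 50", "Bottom 50"))

# Number of rows per segment
table(df$SEGMENT)
```

cut() returns a factor; wrap it in as.character() if a plain character column is preferred. If the data frame is not yet sorted, ordering it first (for example with order() on the relevant column) would be needed before assigning segments by position.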