On Fri, Oct 23, 2009 at 3:58 AM, Dieter Menne <dieter.me...@menne-biomed.de> wrote:
>
> sdanzige wrote:
>>
>> I'm using the Hmisc cut2 function to bin a set of data. It produces bins
>> that I like with results like this:
>>
>> [96,270]:171
>> [69, 96): 54
>> [49, 69): 40
>> [35, 49): 28
>> [28, 35): 14
>> [24, 28): 8
>> (Other) : 48
>>
>> I would like to take a second set of data, and assign it to bins based on
>> factors defined by my call to cut 2.
>>
>
> It used to be quite tricky, but on popular request Brian Ripley has added an
> example how to extract the intervals using regular expression on the bottom
> of the examples for cut (note:cut in base, not cut2 in Hmisc).
>
> If someone knows of an easier way, please correct me. How about adding this
> information as attribute to the standard cut?
>

The strapply function in gsubfn can do it with a simpler regular expression,
since it extracts based on content rather than delimiters, which is what you
want here:

> # create sample data
> library(gsubfn)
> set.seed(1)
> dat <- seq(4, 7, by = 0.05)
> x <- sample(dat, 30)
>
> # use cut
> groups <- cut(x, breaks = 10)
>
> # extract interval boundaries using strapply
> strapply(levels(groups), "[[:digit:].]+", as.numeric, simplify = TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]  4.0  4.3  4.6  4.9  5.2  5.5  5.8  6.1  6.4   6.7
[2,]  4.3  4.6  4.9  5.2  5.5  5.8  6.1  6.4  6.7   7.0

The above is from demo("gsubfn-cut"). For more, see the gsubfn home page at
http://gsubfn.googlecode.com
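
To get from the extracted boundaries back to the original question (assigning
a second set of data to the same bins), something along these lines might
work. This is only a rough sketch in base R, using regmatches/gregexpr in
place of strapply so it runs without gsubfn. The vector y is a made-up second
data set, and the pattern assumes non-negative boundaries printed without
exponents. Note that the boundaries in the level labels are rounded, so values
at the extreme edges of the range can end up as NA.

```r
# sample data and bins, as in the strapply example
set.seed(1)
dat <- seq(4, 7, by = 0.05)
x <- sample(dat, 30)
groups <- cut(x, breaks = 10)

# pull every number out of the level labels, e.g. "(4.3,4.6]" -> 4.3, 4.6
# (assumption: boundaries are non-negative and printed without exponents)
lev <- levels(groups)
bounds <- as.numeric(unlist(regmatches(lev, gregexpr("[[:digit:].]+", lev))))

# the unique, sorted boundaries serve as explicit breaks
brks <- sort(unique(bounds))

# y is a hypothetical second data set, invented for illustration
y <- c(5.0, 5.5, 6.0)
groups_y <- cut(y, breaks = brks)

# tabulate the second data set over the bins defined by the first
table(groups_y)
```

If the bins come from cut2 rather than cut, the labels mix "[" and ")" and the
right = FALSE argument of cut would presumably be needed to match them; I have
not checked that case.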