On 11-12-06 3:34 AM, Sébastien Bihorel wrote:
Obviously, cut would do the job if one knows the number of intervals in
advance, which I assume I won't. I guess what I'm looking for is a function
that figures out the number of intervals and their boundaries.

That's not really a simple problem, but there are functions that do clustering and fit mixture models to data, which might be close enough. See the Cluster task view at http://cran.r-project.org/web/views/Cluster.html.
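As a rough illustration of the clustering route, here is a minimal base-R sketch using single-linkage hclust() and cutree() from the stats package. The helper name find_breaks and its min_gap argument are hypothetical, not part of R or any package. The default threshold (1% of the data range) is an arbitrary assumption about what counts as "reasonably" close, and would need tuning for other data.

```r
# Hypothetical helper: find_breaks() is not part of base R or any package.
# min_gap is an assumed tuning parameter: clusters separated by more than
# min_gap are treated as distinct. The 1%-of-range default is arbitrary.
find_breaks <- function(x, min_gap = 0.01 * diff(range(x))) {
  x <- sort(x)
  # Single-linkage clustering joins points through their nearest neighbours;
  # cutting the tree at height min_gap splits groups farther apart than that.
  tree <- hclust(dist(x), method = "single")
  grp <- cutree(tree, h = min_gap)
  # Positions where the cluster label changes between consecutive sorted values
  split_at <- which(diff(grp) != 0)
  # Put each interior break at the midpoint of the gap between two clusters
  interior <- (x[split_at] + x[split_at + 1]) / 2
  c(min(x), interior, max(x))
}

set.seed(1234)
x <- sort(c(rnorm(20, -1, 0.1), rnorm(10, 5, 0.1), rnorm(10, 100, 0.1)))
breaks <- find_breaks(x)
groups <- cut(x, breaks = breaks, include.lowest = TRUE)
table(groups)
```

The number of intervals falls out of the threshold rather than being supplied up front, but the threshold itself is still a judgment call. The mixture-model packages listed in the task view are an alternative when a model-based choice of the number of clusters is preferred.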

Duncan Murdoch


Sebastien

On Tue, Dec 6, 2011 at 3:29 AM, Sébastien Bihorel <pomc...@free.fr> wrote:

Dear R-users,

I would like to know if there is a function (in base R or the extension
packages) that would automatically detect the break points in a vector x
for later use in the cut function. The idea is to determine the boundaries
of the n intervals (n>=1) delimiting clusters of data points which could be
considered "reasonably" close, given a numerical vector x with unknown
content and unknown multimodal distribution.

For instance, given the vector x defined by
set.seed(1234); x <- sort(c(rnorm(20,-1,0.1), rnorm(10,5,0.1), rnorm(10,100,0.1))),
this function would return a vector of 4 points: min(x), one value between
-1 and 5, one value between 5 and 100, and max(x).

Thank you in advance for your suggestions.

Sebastien






______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

