[R] How to automate the detection of break points for use in cut

Duncan Murdoch murdoch.duncan at gmail.com
Tue Dec 6 13:28:53 CET 2011


On 11-12-06 3:34 AM, Sébastien Bihorel wrote:
> Obviously, cut would do the job if one knows the number of intervals in
> advance, which I assume I won't. I guess what I'm looking for is a function
> that figures out the number of intervals and their boundaries.

That's not a simple problem in general, but there are functions that do
clustering or fit mixture models to data, which might be close enough.
See the Cluster task view at
http://cran.r-project.org/web/views/Cluster.html.
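
As a rough illustration of the clustering route, here is a minimal sketch
using only base R (hclust and cutree from the stats package) on the example
vector from the original message. It is one possible approach, not a
general solution: the threshold `gap` is a hypothetical tuning value picked
for this example, and it assumes the clusters are separated by much more
than their internal spread.

```r
# Example data from the original question: three well-separated clusters.
set.seed(1234)
x <- sort(c(rnorm(20, -1, 0.1), rnorm(10, 5, 0.1), rnorm(10, 100, 0.1)))

# Single-linkage hierarchical clustering on the 1-D data.
hc <- hclust(dist(x), method = "single")

# Assumption: points closer than `gap` belong to the same cluster.
# This value is hypothetical and must be chosen for the data at hand.
gap <- 1
grp <- cutree(hc, h = gap)

# Smallest and largest value within each cluster, ordered along the axis.
lo <- sort(tapply(x, grp, min))
hi <- sort(tapply(x, grp, max))

# Interior breaks sit midway between the top of one cluster and the
# bottom of the next; the outer breaks are min(x) and max(x).
inner <- (head(hi, -1) + tail(lo, -1)) / 2
breaks <- unname(c(min(x), inner, max(x)))

# Use the breaks in cut(); include.lowest keeps min(x) in the first bin.
bins <- cut(x, breaks = breaks, include.lowest = TRUE)
counts <- as.vector(table(bins))
```

With single linkage, cutting the tree at height `gap` amounts to splitting
the sorted data wherever two neighbours are more than `gap` apart, so the
number of intervals falls out of the threshold rather than being fixed in
advance. A mixture-model fit from one of the task view packages would be
the alternative when no sensible threshold is known.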

Duncan Murdoch

>
> Sebastien
>
> On Tue, Dec 6, 2011 at 3:29 AM, Sébastien Bihorel<pomchip at free.fr>  wrote:
>
>> Dear R-users,
>>
>> I would like to know if there is a function (in base R or the extension
>> packages) that would automatically detect the break points in a vector x
>> for later use in the cut function. The idea is to determine the boundaries
>> of the n intervals (n>=1) delimiting clusters of data points which could be
>> considered "reasonably" close, given a numerical vector x with unknown
>> content and unknown multimodal distribution.
>>
>> For instance, given the vector x defined by
>>
>>   set.seed(1234)
>>   x <- sort(c(rnorm(20, -1, 0.1), rnorm(10, 5, 0.1), rnorm(10, 100, 0.1)))
>>
>> this function would return a vector of 4 points: min(x), one value
>> between -1 and 5, one value between 5 and 100, and max(x).
>>
>> Thank you in advance for your suggestions.
>>
>> Sebastien
>>