[R] Why is it not possible to cut a tree returned by Agnes or Diana by height?

Leszek Nowina <leekoinan using gmail.com>
Mon Apr 15 15:10:15 CEST 2019


Either way, it would seem to me that cutree(tree, h=height) could be
easily implemented as cutree(tree, k=sum(tree$height>height)+1) - why
isn't it?

Or is this not really the same, despite how it seems to me?
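
A minimal sketch of the comparison asked about above, using the asdf
data frame from the quoted message. It assumes the cluster package's
as.hclust() method for agnes/diana objects, which returns an "hclust"
tree whose heights are sorted; the cutoff h = 2 is an arbitrary value
chosen for illustration.

```r
library(cluster)  # agnes(), diana(), and as.hclust() methods for their results

asdf <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c(7, 8, 9))
tree <- agnes(asdf)
h <- 2  # hypothetical cutoff, chosen only for this example

# Convert to "hclust" first: its 'height' component is sorted,
# which is what cutree(h = ) checks for.
by_height <- cutree(as.hclust(tree), h = h)

# The count-based form from the message: one cluster per merge height
# above the cutoff, plus one. sum() does not care how tree$height is ordered.
by_count <- cutree(tree, k = sum(tree$height > h) + 1)

# Compare memberships elementwise (the two results may differ in names).
stopifnot(all(by_height == by_count))
```

For a tree whose merge heights never decrease, the two calls should
agree, since cutting at h leaves one cluster per merge above h, plus one.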

On Mon, Apr 15, 2019 at 01:30, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> Inline.
>
> Bert Gunter
>
>
> On Sun, Apr 14, 2019 at 4:12 PM Leszek Nowina <leekoinan using gmail.com> wrote:
>>
>>     > asdf = data.frame(x=c(1,2,3), y=c(4,5,6), z=c(7,8,9))
>>     > cutree(agnes(asdf), h=100)
>>     Error in cutree(agnes(asdf), h = 100) :
>>       the 'height' component of 'tree' is not sorted (increasingly)
>>     > cutree(diana(asdf), h=100)
>>     Error in cutree(diana(asdf), h = 100) :
>>       the 'height' component of 'tree' is not sorted (increasingly)
>>
>> I'm not sure if I understand why this is the case.
>>
>> This is what I want: Cluster stuff by the //distances//, **not** by
>> how many clusters I want to have.
>>
>> If two things are further from each other than X, they should go to
>> different clusters. Otherwise, the same cluster.
>>
>> Is what I'm asking for unreasonable?
>
> Yes.
>
> X and Y are at a distance of 2. Y and Z are at a distance of 2. X and Z are at a distance of 4. Your idea cannot be applied consistently if 3 is the cutoff for clustering: X and Z would have to go in different clusters, yet both would have to be in the same cluster as Y.
>
> Maybe you need to spend some time with the literature before trying to cook up your own notions.
>
> Cheers,
> Bert
>
>
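
The three-point example above can be reproduced as a rough sketch. The
coordinates 0, 2 and 4 on a line are an assumption chosen only to give
the stated distances (2, 2 and 4); hclust() from the stats package is
used here rather than agnes() or diana().

```r
# Three points on a line: d(X, Y) = 2, d(Y, Z) = 2, d(X, Z) = 4.
pts <- matrix(c(0, 2, 4), ncol = 1, dimnames = list(c("X", "Y", "Z"), NULL))
d <- dist(pts)

# Single linkage merges at heights 2 and 2, so a cut at 3 leaves one
# cluster, even though X and Z are 4 apart.
single <- cutree(hclust(d, method = "single"), h = 3)

# Complete linkage merges at heights 2 and 4, so a cut at 3 leaves two
# clusters, which separates Y from a neighbour that is only 2 away.
complete <- cutree(hclust(d, method = "complete"), h = 3)

print(single)
print(complete)
```

Neither result satisfies the pairwise rule "closer than the cutoff means
same cluster, further means different clusters", which is the
inconsistency described above.
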
>>
>> I imagine that if I were to implement Agnes or Diana manually, it
>> would go like this: stop joining clusters once the smallest distance
>> between any pair of clusters is larger than X (Agnes), or stop
>> dividing clusters once the largest cluster has a diameter of X
>> (Diana). But since both methods always join/divide to the very end,
>> I thought using cutree with a height parameter would give me what I
>> need. It won't.
>>
>> Am I missing something?
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


