[R] Is it possible to obtain an agglomeration schedule with R cluster analysis
William Dunlap
wdunlap at tibco.com
Sun Feb 24 00:27:31 CET 2013
You didn't show what the tabular summary should look like.
However, look at the height and merge components of
an hclust object:
> hc3 <- hclust(dist(USArrests[1:8, c(1,2,4)]))
> data.frame(hc3[2:1])
      height merge.1 merge.2
1   9.297849      -1      -8
2  13.609188      -2      -5
3  23.779193      -4      -6
4  33.865321      -3       2
5  48.229659       1       3
6 104.636227       4       5
7 185.135221      -7       6
The two merge.* columns identify which groups merged at
the corresponding height. A negative value, -i, refers to the
i'th leaf, whose name is the i'th entry of the 'labels' component, and a
positive value, i, refers to the cluster created in the i'th row of the
data.frame. The following function transforms those references into names:
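f <- function(hc){
   # one row per merge step; row i describes the cluster created at step i
   data.frame(row.names=paste0("Cluster",seq_along(hc$height)),
              height=hc$height,
              # negative entries are leaves (looked up in hc$labels),
              # positive entries are clusters formed in earlier rows
              components=ifelse(hc$merge<0, hc$labels[abs(hc$merge)], paste0("Cluster",hc$merge)),
              stringsAsFactors=FALSE)
}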
as in
> f(hc3)
             height components.1 components.2
Cluster1   9.297849      Alabama     Delaware
Cluster2  13.609188       Alaska   California
Cluster3  23.779193     Arkansas     Colorado
Cluster4  33.865321      Arizona     Cluster2
Cluster5  48.229659     Cluster1     Cluster3
Cluster6 104.636227     Cluster4     Cluster5
Cluster7 185.135221  Connecticut     Cluster6
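If the table should also report how many cases each new cluster contains, the same merge matrix can be walked row by row. The following is only a sketch: schedule() and its column names are hypothetical (they are not part of hclust or of f() above), and it assumes that an hclust object without labels should fall back to observation numbers.

```r
# Hypothetical helper, not from the original post: one row per merge stage,
# with the two merged components and the size of the resulting cluster.
schedule <- function(hc) {
  n_stage <- nrow(hc$merge)
  # Assumption: use observation numbers when the object carries no labels
  labs <- if (is.null(hc$labels)) as.character(seq_len(n_stage + 1)) else hc$labels

  size <- integer(n_stage)
  for (i in seq_len(n_stage)) {
    m <- hc$merge[i, ]
    # a negative entry is a single case; a positive entry is an earlier stage
    left  <- if (m[1] < 0) 1L else size[m[1]]
    right <- if (m[2] < 0) 1L else size[m[2]]
    size[i] <- left + right
  }

  # translate a column of merge references into leaf or cluster names
  name_of <- function(j) ifelse(j < 0, labs[abs(j)], paste0("Cluster", j))

  data.frame(stage = seq_len(n_stage),
             height = hc$height,
             component1 = name_of(hc$merge[, 1]),
             component2 = name_of(hc$merge[, 2]),
             size = size,
             stringsAsFactors = FALSE)
}

hc3 <- hclust(dist(USArrests[1:8, c(1, 2, 4)]))
s <- schedule(hc3)
print(s)
```

The size of the last stage should equal the number of cases, which is a quick sanity check on the bookkeeping.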
Compare that to the output of str(as.dendrogram(hc3)):
> str(as.dendrogram(hc3))
 --[dendrogram w/ 2 branches and 8 members at h = 185]
  |--leaf "Connecticut"
  `--[dendrogram w/ 2 branches and 7 members at h = 105]
     |--[dendrogram w/ 2 branches and 3 members at h = 33.9]
     |  |--leaf "Arizona"
     |  `--[dendrogram w/ 2 branches and 2 members at h = 13.6]
     |     |--leaf "Alaska"
     |     `--leaf "California"
     `--[dendrogram w/ 2 branches and 4 members at h = 48.2]
        |--[dendrogram w/ 2 branches and 2 members at h = 9.3]
        |  |--leaf "Alabama"
        |  `--leaf "Delaware"
        `--[dendrogram w/ 2 branches and 2 members at h = 23.8]
           |--leaf "Arkansas"
           `--leaf "Colorado"
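To see which cases belong together at a particular point in the schedule, the tree can be cut at a chosen height with cutree(). This is a small sketch; the cut height of 50 is an arbitrary illustrative value picked to fall between the fifth and sixth merge heights shown above.

```r
hc3 <- hclust(dist(USArrests[1:8, c(1, 2, 4)]))

# Cut above the stage-5 merge (about 48.2) and below the stage-6 merge
# (about 104.6); h = 50 is an assumption made for illustration only.
groups <- cutree(hc3, h = 50)

# List the members of each group that exists at that height
members <- split(names(groups), groups)
print(members)
```

Cutting at other heights, or using the k argument of cutree() to ask for a given number of groups, gives the membership at other stages.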
Does f() produce the information you need for your display?
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Bob Green
> Sent: Saturday, February 23, 2013 12:49 PM
> To: Uwe Ligges
> Cc: r-help at r-project.org
> Subject: Re: [R] Is it possible to obtain an agglomeration schedule with R cluster analyis
>
> Hello Uwe,
>
> Thanks. Re-reading the hclust help page, I found that with the
> 'USArrests' example data the command plot(hc1) shows the
> order in which cases joined. However, I still can't see how to obtain
> the height at which each case joined a cluster or the
> height at which clusters merge.
>
>
> The dendrogram {stats} page provides the following code which
> produces the information that I require. However, what I would like
> to obtain is a table of the heights at which the clusters formed.
>
> > hc <- hclust(dist(USArrests), "ave")
> > (dend1 <- as.dendrogram(hc)) # "print()" method
> > str(dend1) # "str()" method
>
> I also found as.hclust which plots what I want, but I still can't
> find a way to produce the actual height values which are being
> plotted, for example as a tabular summary.
>
> plot(hc) ; mtext("hclust", side=1)
>
> Any assistance is appreciated,
>
> Bob
>
>
>
> At 04:01 AM 24/02/2013, Uwe Ligges wrote:
>
>
> >On 22.02.2013 11:41, Bob Green wrote:
> >>Hello,
> >>
> >>In SPSS the cluster analysis output includes an agglomeration schedule,
> >>which details the stages when cases are joined.
> >>
> >>Is it possible to obtain such output when performing cluster analysis in
> >>R? If so, I'd appreciate advice regarding how to obtain this information.
> >
> >
> >If you are talking about hierarchical clustering via hclust(), see ?hclust
> >It tells you that the relevant information is available inside the
> >object and you can even see it via the plot method.
> >
> >Uwe Ligges
> >
> >
> >
> >>
> >>Any assistance is appreciated,
> >>
> >>Regards
> >>
> >>Bob
> >>
> >>______________________________________________
> >>R-help at r-project.org mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.