[BioC] linkage distances

Daniel Brewer daniel.brewer at icr.ac.uk
Thu Jun 14 15:41:10 CEST 2007


Thanks to all of you for your responses, that is really helpful.

Dan

Jarno Tuimala wrote:
> Dear Daniel,
> 
> The ape package on CRAN contains a function (consensus) for calculating a
> consensus of several dendrograms. It can produce either a strict or a
> majority-rule consensus. A strict consensus contains only the groups that
> are present in all the trees, whereas a majority-rule consensus contains
> only the groups that are present in the majority of the trees. I've usually
> used the majority-rule consensus, and it's the standard method used with
> bootstrapping analyses.
> 
> Jarno
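
A minimal sketch of the consensus approach Jarno describes, assuming the ape package is installed. The random matrix and the choice of three linkage methods are illustrative assumptions, not part of the original message. The hclust trees are converted with as.phylo before being passed to consensus, where p = 1 requests a strict consensus and p = 0.5 a majority-rule one.

```r
library(ape)

set.seed(1)
# Illustrative data (assumption): 10 genes x 5 samples of random values
y <- matrix(rnorm(50), 10, 5,
            dimnames = list(paste("g", 1:10, sep = ""), paste("t", 1:5, sep = "")))
d <- dist(y, method = "euclidean")

# One dendrogram per linkage method, converted to ape's "phylo" class.
# An odd number of trees avoids ties at exactly 50% support.
methods <- c("single", "complete", "average")
trees <- lapply(methods, function(m) as.phylo(hclust(d, method = m)))

# Strict consensus: keep only groups found in every tree
strict <- consensus(trees, p = 1)
# Majority-rule consensus: keep groups found in most of the trees
majority <- consensus(trees, p = 0.5)

plot(majority, main = "Majority-rule consensus")
```

With unstructured random data like this, the strict consensus may collapse to a largely unresolved tree, which is itself a sign that the groupings depend on the linkage method.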
> 
> 
> 
> On Wed, 13 Jun 2007, Thomas Girke wrote:
> 
>> Dear Daniel,
>>
>> The only reference that I know that addresses this topic to some extent is
>> this book:
>> 	The Elements of Statistical Learning
>> 	by T. Hastie, R. Tibshirani, J. H. Friedman
>>
>>
>> With regard to William's suggestion: I don't have anything available that would
>> calculate the consensus between different dendrograms. As a start to compute these
>> comparisons, I would loop over the height component in the hclust objects
>> with the cutree function. This way one can obtain all possible clusters
>> defined by each dendrogram and then perform all-against-all consensus comparisons
>> between different dendrograms using one of the set-matching functions (e.g. intersect or %in%).
>>
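>> # For example:
>> # 10 genes (rows) x 5 samples (columns) of random data
>> y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""), paste("t", 1:5, sep="")))
>> hr <- hclust(dist(y, method = "euclidean") )
>> # One column of cluster memberships for each merge height in the tree
>> sapply(hr$height, function(x) cutree(hr, h=x))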
>>
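
One way the all-against-all comparison Thomas outlines could be sketched is shown below. The helper clusters_of and the Jaccard-style overlap score are hypothetical additions for illustration, and the tree is cut by number of clusters (k) rather than by height, which enumerates the same clusters for trees with monotone heights.

```r
set.seed(1)
# Illustrative data (assumption): 10 genes x 5 samples of random values
y <- matrix(rnorm(50), 10, 5,
            dimnames = list(paste("g", 1:10, sep = ""), paste("t", 1:5, sep = "")))
d <- dist(y, method = "euclidean")

# Hypothetical helper: every cluster with at least two members that a
# dendrogram defines, each encoded as a sorted, comma-separated string
clusters_of <- function(hc) {
  n <- length(hc$order)
  out <- character(0)
  for (k in 2:(n - 1)) {
    memb <- cutree(hc, k = k)            # named membership vector
    groups <- split(names(memb), memb)   # member labels per cluster
    groups <- groups[sapply(groups, length) > 1]
    out <- c(out, sapply(groups, function(g) paste(sort(g), collapse = ",")))
  }
  unique(unname(out))
}

cl_complete <- clusters_of(hclust(d, method = "complete"))
cl_average  <- clusters_of(hclust(d, method = "average"))

# Clusters found in both dendrograms, and their share of all clusters
shared  <- intersect(cl_complete, cl_average)
jaccard <- length(shared) / length(union(cl_complete, cl_average))
```

A score near 1 means the two linkage methods define almost the same clusters; a score near 0 means they mostly disagree.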
>>
>> Thomas
>>
>>
>> On Wed 06/13/07 06:25, William Shannon wrote:
>>> I tend to use a 'consensus' approach when doing cluster analysis.  If by linkage distance you mean genetic linkage (I assume you do), you could try the various linkage distances and see if the dendrogram is stable.  This also works if you are dealing with non-genetic distance measures.
>>>
>>> If you do this and the dendrograms are essentially stable, you are done.  More formal methods for consensus trees (dendrograms) can be found by searching for work by Fred McMorris (look in discrete math and evolutionary biology), and I believe the numerical taxonomy software PAUP has consensus methods in it.
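
A rough sketch of the stability check William suggests, using only base R. Correlating cophenetic distances is one possible way to quantify "stable" and is an assumption here, not something proposed in the thread; the data and the set of linkage methods are also illustrative.

```r
set.seed(1)
# Illustrative data (assumption): 10 genes x 5 samples of random values
y <- matrix(rnorm(50), 10, 5,
            dimnames = list(paste("g", 1:10, sep = ""), paste("t", 1:5, sep = "")))
d <- dist(y, method = "euclidean")

# Build one dendrogram per linkage method
methods <- c("single", "complete", "average")
trees <- lapply(methods, function(m) hclust(d, method = m))
names(trees) <- methods

# Cophenetic distance = height at which two items are first merged.
# One column of pairwise cophenetic distances per linkage method.
coph <- sapply(trees, function(hc) as.vector(cophenetic(hc)))

# Pairwise correlation between methods; values near 1 suggest the
# dendrograms are similar regardless of linkage method
stability <- cor(coph)
print(round(stability, 2))
```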
>>>
>>> Maybe Tom Girke has consensus tools in R/Bioconductor.
>>>
>>> Bill Shannon
>>> Washington Univ. School of Medicine
>>>
>>> PS -- I am running for President elect of the Classification Society of North America and encourage anyone doing cluster/classification work to look at this society for their research and publications (Journal of Classification and http://www.classification-society.org/csna/csna.html)
>>>
>>>
>>>
>>> Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:
>>>
>>> Hi,
>>>
>>> I have been producing some dendrograms using hclust with a variety of
>>> linkage distance measures.  Does anyone know of a good resource
>>> that explains why one would use one linkage distance rather than another?
>>>
>>> I don't really like dealing with dendrograms, but we want to produce
>>> groupings based on these to do differential analysis on, and I would
>>> like to be able to justify it.
>>>
>>> Thanks
>>>
>>> Dan
>>>

-- 
**************************************************************
Daniel Brewer, Ph.D.
Institute of Cancer Research
United Kingdom
**************************************************************
