[R] similarity matrix conversion to dissimilarity
Doran, Harold
HDoran at air.org
Wed Dec 8 23:43:38 CET 2004
Dear Sir:
I posed a similar question a few months back and received many
responses. Check the searchable archives on CRAN for those helpful
emails. I did a search for 'similarity matrix' and many results were
returned.
Harold
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Dr. Thomas
Isenbarger
Sent: Wednesday, December 08, 2004 4:12 PM
To: r-help at stat.math.ethz.ch
Subject: [R] similarity matrix conversion to dissimilarity
I have a matrix of similarity scores that I want to convert into a
matrix of dissimilarity scores so that I can apply some clustering
methods to the data. That is, high values in my matrix signify
similarity and low values (zero being the lowest) signify no similarity.
What functions/options in R or its packages are available for making
this kind of transformation of a matrix?
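One simple conversion of this kind subtracts every score from the
largest score in the matrix, so that the most similar pairs end up at
the smallest distances. The sketch below is only an illustration in
Python, not an R function; the name similarity_to_dissimilarity and the
toy scores are assumptions made for the example.

```python
def similarity_to_dissimilarity(sim):
    """Convert a square similarity matrix (list of lists) to dissimilarities.

    Assumed rule: d[i][j] = max_score - sim[i][j], with the diagonal
    forced to zero so every item is at distance 0 from itself.
    """
    n = len(sim)
    if any(len(row) != n for row in sim):
        raise ValueError("similarity matrix must be square")
    max_score = max(max(row) for row in sim)
    dis = [[max_score - sim[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        dis[i][i] = 0
    return dis


# Toy similarity scores (hypothetical): 0 means "no similarity".
scores = [
    [10, 8, 0],
    [8, 10, 2],
    [0, 2, 10],
]
dis = similarity_to_dissimilarity(scores)
for row in dis:
    print(row)
```

The subtraction keeps the ordering of pairs intact (only reversed),
which is all that rank-based clustering methods need; whether the
resulting numbers behave like true distances depends on the scores.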
Specifically, I am a molecular biologist. I have a set of 700+
nucleotide sequences I want to group into clusters based on sequence
similarities. There is a wide range of sequences in the set, some of
which are homologous to other sequences in the set. I want to use
clustering to identify these groups.
If the sequences were related and could be trimmed to the same length, I
would do an alignment and then use phylip (or some other distance
method) to create a distance matrix, but since my sequences are
unrelated and cannot be trimmed to the same length, I am at a loss for
what to do.
For a set with so many unrelated sequences of different lengths, the
only thing I have been able to do is an all-against-all BLAST to create the
matrix, but this gives high scores for similarities, not high scores for
dissimilarities. The only thought I had was to use the reciprocal of
the BLAST score as some perverse measure of distance.
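A plain reciprocal is undefined for pairs that score zero, and raw
scores also grow with sequence length. One possible alternative,
sketched below in Python as an illustration only, scales each pairwise
score by the smaller of the two self-scores and subtracts the result
from 1. The function name blast_scores_to_distance, the choice of the
larger of the two directional scores, and the toy numbers are all
assumptions, not something established in this thread.

```python
def blast_scores_to_distance(scores):
    """Turn an all-against-all score matrix into distances in [0, 1].

    Assumptions (illustrative, not a standard):
      * scores[i][i] is the self-score of sequence i;
      * the two directional scores may differ, so the larger is used;
      * distance = 1 - pair_score / min(self_score_i, self_score_j).
    """
    n = len(scores)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            pair = max(scores[i][j], scores[j][i])
            self_score = min(scores[i][i], scores[j][j])
            # A zero self-score gives no basis for scaling; treat as unrelated.
            ratio = pair / self_score if self_score > 0 else 0.0
            d = 1.0 - min(ratio, 1.0)  # clamp so distances stay non-negative
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical scores for three sequences; the third matches nothing.
toy = [
    [100.0, 50.0, 0.0],
    [40.0, 80.0, 0.0],
    [0.0, 0.0, 120.0],
]
dist = blast_scores_to_distance(toy)
for row in dist:
    print(row)
```

Unlike the reciprocal, this gives pairs with no hit a finite distance
of 1, so unrelated sequences can still be passed to a clustering
method.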
I am not subscribed to the list, so can I ask for responses directly to
my email address?
Thank you,
Tom Isenbarger
--
isen at plantpath.wisc.edu
thomas a isenbarger
(608) 265-0850