[R] Extracting all members with a specific similarity value from a large similarity matrix

Buergmann, Helmut Helmut.Buergmann at eawag.ch
Tue Mar 23 11:40:44 CET 2010


I have a large data frame (1400 x 1400) containing a symmetric similarity matrix. I would like to extract subsets of elements in which every element has the same specific similarity to every other element of the subset.
For example, if the data look like this:

	Spl1	Spl2	Spl3	Spl4	Spl5	[...]
Spl1	1	0.125	0.000	0.000	0.125
Spl2	0.125	1	0.000	0.000	0.125
Spl3	0.000	0.000	1	0.000	0.500
Spl4	0.000	0.000	0.000	1	0.750
Spl5	0.125	0.125	0.500	0.750	1
[...]

I am looking for a way either to extract all elements that mutually have similarity 0, e.g.:
	Spl1	Spl3	Spl4	[...]
Spl1	1	0.000	0.000
Spl3	0.000	1	0.000
Spl4	0.000	0.000	1
[...]

Or that mutually have similarity 0.125:

	Spl1	Spl2	Spl5	[...]
Spl1	1	0.125	0.125
Spl2	0.125	1	0.125
Spl5	0.125	0.125	1
[...]
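
A minimal base-R sketch of one way such a subset could be pulled out, assuming the data frame converts cleanly with as.matrix() and that "specific similarity" means equality up to a small tolerance. The helper name mutual_subset and its arguments (seed, tol) are hypothetical, not from any package. The approach is greedy: it grows a single subset from one starting element, so the result depends on the seed and is not guaranteed to be the largest possible subset.

```r
# Toy version of the matrix from the post (values copied from the example).
spl <- paste0("Spl", 1:5)
sim <- matrix(c(
  1,     0.125, 0,   0,    0.125,
  0.125, 1,     0,   0,    0.125,
  0,     0,     1,   0,    0.5,
  0,     0,     0,   1,    0.75,
  0.125, 0.125, 0.5, 0.75, 1
), nrow = 5, byrow = TRUE, dimnames = list(spl, spl))

# Hypothetical helper: greedily grow one subset whose members all have
# pairwise similarity `value`, starting from the element at index `seed`.
mutual_subset <- function(sim, value, seed = 1, tol = 1e-8) {
  sim <- as.matrix(sim)            # also accepts a data frame
  hit <- abs(sim - value) < tol    # TRUE where the similarity matches
  diag(hit) <- TRUE                # the diagonal of 1s is ignored
  members <- seed
  for (j in seq_len(ncol(sim))[-seed]) {
    # add j only if it matches every member collected so far
    if (all(hit[j, members])) members <- c(members, j)
  }
  members <- sort(members)
  sim[members, members, drop = FALSE]
}

zero_block   <- mutual_subset(sim, 0)      # seeded at Spl1
eighth_block <- mutual_subset(sim, 0.125)  # seeded at Spl1
```

On the toy data, seeding at Spl1 gives the Spl1/Spl3/Spl4 block for value 0 and the Spl1/Spl2/Spl5 block for value 0.125. If every maximal subset is needed rather than one per seed, the problem amounts to enumerating cliques in a graph whose edges are the matching pairs; the igraph package's max_cliques() may be worth a look for that, though it could be slow on a dense 1400 x 1400 matrix.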

Or alternatively to sort the table so that this info can easily be obtained by looking for blocks around the diagonal, like this:
	Spl3	Spl4	Spl1	Spl2	Spl5	[...]
Spl3	1	0.000	0.000	0.000	0.500
Spl4	0.000	1	0.000	0.000	0.750
Spl1	0.000	0.000	1	0.125	0.125
Spl2	0.000	0.000	0.125	1	0.125
Spl5	0.500	0.750	0.125	0.125	1
[...]
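
A rough sketch of how such a reordering might be obtained with hclust() from the stats package. The idea, which is an assumption about what would be useful here rather than an established recipe, is to treat abs(similarity - value) as a distance, so that pairs with the target similarity are at distance 0. With complete linkage, a cluster's merge height is its largest pairwise distance, so clusters formed at height 0 contain only mutually matching elements. The variable names (value, sorted, groups) are illustrative.

```r
# Toy version of the matrix from the post (values copied from the example).
spl <- paste0("Spl", 1:5)
sim <- matrix(c(
  1,     0.125, 0,   0,    0.125,
  0.125, 1,     0,   0,    0.125,
  0,     0,     1,   0,    0.5,
  0,     0,     0,   1,    0.75,
  0.125, 0.125, 0.5, 0.75, 1
), nrow = 5, byrow = TRUE, dimnames = list(spl, spl))

value <- 0                 # the similarity the blocks should share

d <- abs(sim - value)      # 0 wherever the similarity equals `value`
diag(d) <- 0               # the diagonal of 1s is ignored

# Complete linkage: merge height = largest pairwise distance in a cluster.
hc <- hclust(as.dist(d), method = "complete")

# Reorder rows and columns by the dendrogram's leaf order, so each
# cluster sits in a contiguous block around the diagonal.
sorted <- sim[hc$order, hc$order]

# Group labels for clusters merged at (numerically) zero height,
# i.e. groups whose members all have mutual similarity `value`.
groups <- cutree(hc, h = 1e-8)
```

Which elements end up together depends on how ties are broken during clustering. In the toy data, for instance, Spl2 also has similarity 0 with Spl3 and Spl4, so more than one grouping is valid. The blocks found this way are consistent but not necessarily the largest ones.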

Any help is much appreciated!

Helmut Bürgmann, Switzerland


