[BioC] similarity between two gene lists with varied length
Shannon, William
WSHANNON at dom.wustl.edu
Sun Aug 24 03:15:50 CEST 2008
My first thought is that a similarity can be based on the ratio of the number of genes in the intersection of the two lists to the number of genes in the union of the two lists (this ratio is known as the Jaccard index). If the two lists are identical the similarity is 1, and if they have no genes in common the similarity is 0. Of course, this does not explicitly take into account the lengths of the gene lists.
You would have to think through how the similarity behaves when only some of the genes are in both lists, particularly when one list is much longer than the other.
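To make the partial-overlap case concrete, here is a minimal sketch of the intersection-over-union ratio in Python. The function name and gene symbols are made up for illustration, and treating two empty lists as similarity 0 is an assumption, not something stated above. In R the same ratio can be computed with the base functions intersect() and union().

```python
def jaccard_similarity(list_a, list_b):
    """Ratio of shared genes to all distinct genes across both lists."""
    set_a, set_b = set(list_a), set(list_b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty lists are given similarity 0
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical gene symbols, for illustration only
short_list = ["TP53", "BRCA1", "EGFR"]
long_list = ["TP53", "BRCA1", "EGFR", "MYC", "KRAS", "PTEN"]

print(jaccard_similarity(short_list, short_list))        # identical lists: 1.0
print(jaccard_similarity(short_list, ["MYC", "KRAS"]))   # no genes in common: 0.0
print(jaccard_similarity(short_list, long_list))         # 3 shared / 6 total: 0.5
```

Note the last case: the short list is entirely contained in the long one, yet the similarity is only 0.5, because the union grows with the longer list. Whether that is the desired behavior depends on the question being asked.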
Bill Shannon
Associate Professor of Biostatistics in Medicine
Washington University School of Medicine
President-Elect, Classification Society
________________________________________
From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Weiwei Shi [helprhelp at gmail.com]
Sent: Saturday, August 23, 2008 7:55 PM
To: r-help at stat.math.ethz.ch
Cc: Bioconductor
Subject: [BioC] similarity between two gene lists with varied length
Dear listers,
a little off-topic:
I am looking for algorithms, and would like to compare them, that can calculate a
"distance" or "similarity" between two gene lists of different lengths.
Any papers, implementations in R, or suggestions are welcome!
Thanks,
--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.
"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor