[BioC] Odds Ratio in GOstat [resolved?]
Naomi Altman
naomi at stat.psu.edu
Tue Dec 12 05:12:47 CET 2006
The duplicate genes problem is an interesting one. The selected gene
list includes duplicates because it comes from blasting an EST set
from an unsequenced species against a sequenced species. Each
duplicate id is the nearest homolog of its EST, but the ESTs sharing
that id are thought to represent multiple genes. How to handle this
for GO enrichment is an interesting question.
E.g., the annotation has genes A, B, and C.
We observe that matches A1, A2, and B1 are upregulated, but B2 and C
are not. Should we say that 3 out of 5 are upregulated, or 2 out of 3?
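
To make the two counts concrete, here is a minimal sketch in Python. It is not GOstats or Category code; the names (matches, count_per_match, count_per_gene) are made up for illustration. The per-gene rule used here, calling a gene upregulated if any of its matches is upregulated, is an assumption. Other rules (all matches, a majority) would give different per-gene counts.

```python
from collections import defaultdict

# Hypothetical data mirroring the example: each EST match is mapped to its
# nearest homolog (the annotated gene) together with an upregulation flag.
matches = {
    "A1": ("A", True),
    "A2": ("A", True),
    "B1": ("B", True),
    "B2": ("B", False),
    "C": ("C", False),
}


def count_per_match(matches):
    """Treat every match as its own unit: (upregulated, total)."""
    up = sum(1 for _, is_up in matches.values() if is_up)
    return up, len(matches)


def count_per_gene(matches):
    """Collapse matches to annotated genes: (upregulated, total).

    Assumption: a gene counts as upregulated if ANY of its matches is.
    """
    gene_up = defaultdict(bool)
    for gene, is_up in matches.values():
        gene_up[gene] = gene_up[gene] or is_up
    return sum(gene_up.values()), len(gene_up)


print(count_per_match(matches))  # (3, 5)
print(count_per_gene(matches))   # (2, 3)
```

Whichever unit is chosen, the gene universe would presumably have to be counted the same way, so that the selected list and the universe are on the same footing.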
--Naomi
At 07:43 PM 12/11/2006, Seth Falcon wrote:
>The selected gene list contained duplicate ids. I'm pretty sure this
>is the problem. The Category + GOstats code should detect such input
>errors and give a sensible error message. I will add such checking
>very soon.
>
>+ seth
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111