[BioC] which universe in hyperGTest
James W. MacDonald
jmacdon at med.umich.edu
Thu Jan 11 17:14:10 CET 2007
marco zucchelli wrote:
> Hi,
>
> I am using hyperGTest to test GO. For the universe of genes I use the
> following:
>
> ENTREZ <- as.list(hgu133plus2ENTREZID)
> univ <- unlist(ENTREZ)
> univ <- univ[!is.na(univ)]
>
> Now in univ there are 47,430 genes but only 19,871 are unique, since the
> same gene can be hybridized several times on the same array.
>
>
>>length(univ)
>
> [1] 47430
>
>>length(unique(univ))
>
> [1] 19871
>
>
> Is it correct to have repetitions or should a list of unique genes be used?
> i.e. should I use:
>
> universeGeneIds=univ or
> universeGeneIds=unique(univ)
You want the unique genes. In addition, if you have done any
pre-filtering of the data to remove e.g., those genes that don't change
expression, you want to remove those from your universe as well. The
universe should only consist of unique genes that could have been
selected by whatever statistical test you used. BTW, the geneIds should
also be unique.
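
A minimal sketch of that bookkeeping, using a toy probe-to-Entrez list in
place of as.list(hgu133plus2ENTREZID) so it runs without the annotation
package. The probe names, the IDs, and the kept_probes and selected_probes
vectors are all made up for illustration; in practice they would come from
your own filtering step and statistical test.

```r
# Toy stand-in for as.list(hgu133plus2ENTREZID): probe set -> Entrez Gene ID.
# Probe names and IDs are hypothetical.
entrez <- list(p1 = "10", p2 = "10", p3 = "20", p4 = NA, p5 = "30", p6 = "40")

# Hypothetical probe sets that survived pre-filtering (p6 was filtered out).
kept_probes <- c("p1", "p2", "p3", "p4", "p5")

# Hypothetical probe sets called significant by the statistical test.
selected_probes <- c("p1", "p2", "p5")

# Universe: unique, non-NA Entrez IDs among the probes that could have
# been selected.
univ <- unlist(entrez[kept_probes])
univ <- unique(univ[!is.na(univ)])

# Selected genes: also unique and non-NA, and restricted to the universe.
gene_ids <- unlist(entrez[selected_probes])
gene_ids <- unique(gene_ids[!is.na(gene_ids)])
gene_ids <- intersect(gene_ids, univ)

# These two vectors are what would be passed as geneIds and universeGeneIds
# when building the GOHyperGParams object, e.g. (not run here):
#   new("GOHyperGParams", geneIds = gene_ids, universeGeneIds = univ,
#       annotation = "hgu133plus2", ontology = "BP", pvalueCutoff = 0.05,
#       conditional = FALSE, testDirection = "over")
```

With the toy data, probes p1 and p2 map to the same gene, so it is counted
once in both the universe and the selected set.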
Seth has made some changes to the GOstats vignette that should make all
of this quite clear:
http://www.bioconductor.org/packages/2.0/bioc/vignettes/GOstats/inst/doc/GOstatsHyperG.pdf
Best,
Jim
>
> in the GOHyperGParams ??
>
> Regards
>
> Marco
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623