[BioC] Entrez Gene ID to Probe Set Name
Marc Carlson
mcarlson at fhcrc.org
Fri Oct 24 19:42:43 CEST 2008
Hi Monnie,
This is pretty easy once you know about the revmap() function.
Here is a quick example:
library(hgu95av2.db)
mget("1557", revmap(hgu95av2ENTREZID))
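To make clear what reversing a mapping does, here is a minimal base-R sketch that needs no annotation package. The probe set names (probe_A, ...) and the list probe2entrez are made-up toy values, not real hgu95av2 data; the real revmap() works on the package's own map objects, and this only imitates the idea.

```r
# Toy probe-set -> Entrez Gene ID mapping (hypothetical IDs, not real annotation data).
probe2entrez <- list(
  probe_A = "1557",
  probe_B = "1557",
  probe_C = "5595",
  probe_D = NA_character_   # probe set with no Entrez Gene ID
)

# Flatten to a named character vector and drop unmapped probe sets.
flat <- unlist(probe2entrez)
flat <- flat[!is.na(flat)]

# Reverse the mapping: group probe-set names by Entrez Gene ID.
entrez2probe <- split(names(flat), unname(flat))

# Look up one gene, analogous to the mget() call on the reversed map.
entrez2probe["1557"]
```

As in the toy data, one Entrez Gene ID can correspond to several probe sets, so the result of a lookup is a list of character vectors rather than a single name.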
Also, if you want to know more, you might want to look at the
AnnotationDbi vignette:
http://www.bioconductor.org/packages/2.4/bioc/html/AnnotationDbi.html
Marc
McGee, Monnie wrote:
> Here is the previous query with a more descriptive subject.
>
>
> -----Original Message-----
> From: McGee, Monnie
> Sent: Thu 10/23/2008 11:14 AM
> To: bioconductor at stat.math.ethz.ch
> Subject: RE: Bioconductor Digest, Vol 68, Issue 23
>
> Dear List,
>
> Is there an elegant way to obtain the name of a probe set from an Affymetrix platform (doesn't matter which one) corresponding to a given Entrez Gene ID? It seems fairly simple to obtain the Entrez ID if you have a probe set, but the reverse problem seems non-trivial, at least to me.
>
> There's no particular reason I need to know. I just want to know if it's possible.
>
> Thanks!
> Monnie
>
> Monnie McGee, Ph.D.
> Associate Professor
> Department of Statistical Science
> Southern Methodist University
> Ph: 214-768-2462
> Fax: 214-768-4035
>
>
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch on behalf of bioconductor-request at stat.math.ethz.ch
> Sent: Thu 10/23/2008 5:00 AM
> To: bioconductor at stat.math.ethz.ch
> Subject: Bioconductor Digest, Vol 68, Issue 23
>
>
> Today's Topics:
>
> 1. GOstat: listing genes from hyperGTest (Tim Smith)
> 2. export toptables into Genespring (Pemmasani, Kalyani)
> 3. Re: Limma contrasts question (James W. MacDonald)
> 4. Re: GOstat: listing genes from hyperGTest (James W. MacDonald)
> 5. Re: Limma contrasts question (Daniel Brewer)
> 6. quality assessment and preprocessing for tiling array-based
> CGH data (Leon Yee)
> 7. GOstats and org.EcK12.eg.db (Robert Castelo)
> 8. Re: quality assessment and preprocessing for tiling
> array-based CGH data (Sean Davis)
> 9. Re: GOstat: listing genes from hyperGTest (Tim Smith)
> 10. Re: quality assessment and preprocessing for tiling
> array-based CGH data (Leon Yee)
> 11. Re: Beadarray and illumina methylation arrays (Mark Dunning)
> 12. Re: quality assessment and preprocessing for tiling
> array-based CGH data (Sean Davis)
> 13. Problem using Rgraphviz (edge weights going missing). (Dan Bolser)
> 14. Re: newbie problems with AnnBuilder (Mark Kimpel)
> 15. Re: newbie problems with AnnBuilder (Sean Davis)
> 16. Re: newbie problems with AnnBuilder (Mark Kimpel)
> 17. Re: GOstats and org.EcK12.eg.db (Robert Gentleman)
> 18. Re: quality assessment and preprocessing for tiling
> array-based CGH data (Leon Yee)
> 19. Bioconductor installation problem: unable to access
> repository (Shinichiro Wachi)
> 20. Re: quality assessment and preprocessing for tiling
> array-based CGH data (Sean Davis)
> 21. Re: GOstat: listing genes from hyperGTest (James W. MacDonald)
> 22. Re: Bioconductor installation problem: unable to access
> repository (Patrick Aboyoun)
> 23. Bioconductor 2.3 is released (Patrick Aboyoun)
> 24. Re: How to save result from limma (Jenny Drnevich)
> 25. scale questions (Hui-Yi Chu)
> 26. Re: [Fwd: batch info for cellHTS] (Florian Hahne)
> 27. problem with Category package and custom annotationDbi
> (Mark Kimpel)
> 28. Re: problem with Category package and custom annotationDbi
> (Marc Carlson)
> 29. Re: scale questions (Sean Davis)
> 30. Re: scale questions (Sean Davis)
> 31. Re: problem with Category package and custom annotationDbi
> (Mark Kimpel)
> 32. Re: How to save result from limma (Gordon K Smyth)
> 33. Package "xps" "import.expr.scheme" error (Wei,Caimiao)
> 34. Re: Lumi and Beadstudio 1.5.13 (Leon Peshkin)
> 35. Offre exceptionnelle suite au problème technique
> (Clara de Dessous Chéri)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 22 Oct 2008 03:43:33 -0700 (PDT)
> From: Tim Smith <tim_smith_666 at yahoo.com>
> Subject: [BioC] GOstat: listing genes from hyperGTest
> To: bioc <bioconductor at stat.math.ethz.ch>
> Message-ID: <257981.79114.qm at web58005.mail.re3.yahoo.com>
> Content-Type: text/plain
>
>
> Hi,
>
> I was performing a hyperGTest for genes in Homo sapiens. For a set of
> input genes, this function returns some 'significant' GO terms. What I
> wanted to do now was to correlate each significant GO term (returned by
> this function) with the genes (from my set of input genes) associated with
> that GO term. However, I think that I may be using the wrong
> package/function to get the relevant set of genes.
>
> Currently, what I'm doing is finding the significant GO terms by using the following code:
>
> -----------------------
> ### 'genes1' are the Entrez IDs of my genes of interest, and 'allGenes' is the universe of Entrez IDs
>
> library(GOstats)        # provides GOHyperGParams and hyperGTest
> library(org.Hs.eg.db)   # human Entrez Gene annotation
>
> paramsGO <- new("GOHyperGParams", geneIds = genes1,
> universeGeneIds = allGenes, annotation = "org.Hs.eg.db",
> ontology = "BP", pvalueCutoff = 1, conditional = FALSE,
> testDirection = "over")
>
> GO <- hyperGTest(paramsGO)
> --------------------------
> This gives me a set of significant GO terms. Now, I would like to find which
> subset of genes in 'genes1' is associated with each of the significant
> GO terms. To do this, I map all GO terms to their Entrez IDs with the
> 'org.Hs.eg.db' package, using the following:
>
> xx <- as.list(org.Hs.egGO2EG)
>
> to get a mapping of GO terms to Entrez IDs. I get 6,756 GO terms (isn't
> this number small?) that map to at least one Entrez ID. So, from here I
> look up which Entrez IDs are associated with my GO term of interest.
>
> My problem is that often the GO term from hyperGTest is not associated
> with any Entrez ID (using xx <- as.list(org.Hs.egGO2EG), described
> above), i.e. the GO term/ID is not in the list obtained from
> 'org.Hs.egGO2EG'. For example, the term 'GO:0043284' is returned by
> hyperGTest, but does not appear to be associated with any Entrez IDs in
> the org.Hs.eg.db package. Where could I be going wrong?
>
> I would give a set of genes so that the example is reproducible, but [[elided Yahoo spam]]
>
> Thanks for any comments/suggestions. I realize that I'm probably doing something really stupid here....
>
> My sessionInfo() is:
> --------------------------------
> R version 2.7.2 (2008-08-25)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] grid splines tools stats graphics grDevices utils datasets methods base
>
> other attached packages:
>  [1] gplots_2.6.0 gmodels_2.14.1 gtools_2.4.0 gdata_2.4.1 Rgraphviz_1.18.1 GOstats_2.6.0 Category_2.6.0
>  [8] RBGL_1.16.0 annotate_1.18.0 xtable_1.5-2 graph_1.18.0 PFAM.db_2.2.0 GO.db_2.2.0 KEGG.db_2.2.0
> [15] org.Hs.eg.db_2.2.0 AnnotationDbi_1.2.0 RSQLite_0.6-8 DBI_0.2-4 genefilter_1.20.0 survival_2.34-1 affy_1.18.0
> [22] preprocessCore_1.2.0 affyio_1.8.0 Biobase_2.0.0
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 MASS_7.2-44
>
>
> ---------------------------------
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 22 Oct 2008 12:34:38 +0100
> From: "Pemmasani, Kalyani" <kalyani.pemmasani at nuigalway.ie>
> Subject: [BioC] export toptables into Genespring
> To: <bioconductor at stat.math.ethz.ch>
> Message-ID:
> <6B017AD2AE2F6F489087FC986588136B88FA42 at EVS1.ac.nuigalway.ie>
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> Hi all,
>
> Is there a way to export toptables from LIMMA into Genespring software program (from Agilent technologies) for clustering?
>
> Best regards,
> Kalyani
> -------------------------------------------
> Kalyani Pemmasani
> Marie Curie research fellow
> National Diagnostics Centre
> National University of Ireland
> Galway, IRELAND
> e.mail: kalyani.pemmasani at nuigalway.ie
> Ph.no: +353(0)91492815
> Fax: +353 (0) 91586570
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 22 Oct 2008 09:07:16 -0400
> From: "James W. MacDonald" <jmacdon at med.umich.edu>
> Subject: Re: [BioC] Limma contrasts question
> To: Daniel Brewer <daniel.brewer at icr.ac.uk>
> Cc: bioconductor at stat.math.ethz.ch
> Message-ID: <48FF2584.5010509 at med.umich.edu>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Daniel Brewer wrote:
>
>
>> Hi Jim,
>>
>> Could you go into the maths of the contrast formulas a bit? I would
>> like to get a really solid understanding of what it is doing for future
>> analyses.
>>
>
> Once you understand what the coefficients are, the contrasts are just
> simple algebra. In your case, all of the coefficients are estimating the
> difference between the sample and PC3M (e.g., Knockdown - PC3M).
>
> So the algebra is something like this:
>
> 2(Knockdown - PC3M) - (Scramble - PC3M)
> =
> 2Knockdown - 2PC3M - Scramble + PC3M
> =
> 2Knockdown - Scramble - PC3M
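> =
> 2(Knockdown - (Scramble + PC3M)/2)
>
> Which is twice knockdown minus the mean of the controls. The factor of 2
> scales the estimate and its standard error equally, so the t-statistic is
> the same as for Knockdown - (Scramble + PC3M)/2.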
>
> Note that this will be the numerator of the resulting t-statistic. The
> denominator will be sort of an average of the variability within each of
> the three groups being compared. So the question being answered is 'What
> genes are different in Knockdown as compared to the average of the
> controls?'. However, there is nothing here to test if the two controls
> are similar at all (and you might not care).
>
> So for instance, you might have a gene with average expression like this:
>
> Knockdown = 10
> PC3M = 4
> Scramble = 7
>
> If the intra-group variability is small for each sample type, then you
> will likely get a significant t-statistic even though the two controls
> are probably significantly different as well. Which is why I mentioned
> earlier that you might want to test the Scramble - PC3M contrast as well.
>
> Best,
>
> Jim
>
>
>
>> Many thanks
>>
>> Dan
>>
>>
>
>
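The contrast algebra in the quoted Message 3 can be checked numerically with the group means given there (Knockdown = 10, PC3M = 4, Scramble = 7). This is only an arithmetic sketch in base R under the assumption that PC3M is the reference level; the variable names are illustrative and no limma fit is involved.

```r
# Group means taken from the example in the quoted message.
knockdown <- 10
pc3m      <- 4
scramble  <- 7

# Coefficients of a fit with PC3M as the reference level.
coef_knockdown <- knockdown - pc3m   # Knockdown - PC3M
coef_scramble  <- scramble  - pc3m   # Scramble  - PC3M

# The contrast 2(Knockdown - PC3M) - (Scramble - PC3M).
contrast_value <- 2 * coef_knockdown - coef_scramble

# Knockdown minus the mean of the two controls.
diff_from_controls <- knockdown - (scramble + pc3m) / 2

contrast_value       # 9
diff_from_controls   # 4.5
```

The contrast comes out as exactly twice the difference between Knockdown and the mean of the controls. Rescaling a contrast by a constant rescales its standard error by the same constant, so the resulting t-statistic does not change.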