[BioC] A question about the gage package

James W. MacDonald jmacdon at uw.edu
Mon Mar 25 19:10:00 CET 2013


Hi Yiwen He,


On 3/25/2013 1:49 PM, He, Yiwen (NIH/CIT) [C] wrote:
> Hi,
>
> I am looking into using the gage package for gene set analysis. I would like to test run it on the human diabetic muscle microarray data used in the paper that first described GSEA. I downloaded the expression data (Diabetes_hgu133a.gct) from the Broad Institute website, and also downloaded the C2 gene set there (c2.symbols.gmt). However, the IDs in the expression dataset are Affymetrix probe IDs, while the IDs in the gene set are gene symbols (or Entrez Gene IDs if I download another version).
>
> Your manual says these two IDs should match, and I understand that. But what should I do when they don't match? The examples given in the manual have everything set up the right way already.

You should change them so they do match. Probably the easiest is to
convert the probe IDs to Entrez Gene IDs. To convert, you want to use the
hgu133a.db package and select().

Something like

library(hgu133a.db)
egids <- select(hgu133a.db, <vector of probe IDs here>, "ENTREZID")

will give you a data.frame with the probeset IDs and Entrez Gene IDs that you
can use.
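
Once you have that data.frame you still need to get from probe-level rows to
gene-level rows, since some probesets have no Entrez Gene ID and some genes
have several probesets. Here is a rough sketch of one way to do that, not the
only one. The helper name collapse_to_entrez() and the toy data are made up
for illustration, and dropping probesets that map to more than one gene and
then averaging probesets per gene are my assumptions; gage does not require
either choice.

```r
# Hypothetical helper (not part of gage or AnnotationDbi).
# expr: numeric matrix, probe IDs as rownames, samples as columns.
# map:  data.frame with columns PROBEID and ENTREZID, shaped like the
#       result of select(hgu133a.db, rownames(expr), "ENTREZID").
collapse_to_entrez <- function(expr, map) {
  map$PROBEID  <- as.character(map$PROBEID)
  map$ENTREZID <- as.character(map$ENTREZID)
  # Drop probesets without an Entrez Gene ID
  map <- map[!is.na(map$ENTREZID), , drop = FALSE]
  # Drop probesets that map to more than one gene (assumption: treat as ambiguous)
  ambiguous <- unique(map$PROBEID[duplicated(map$PROBEID)])
  map <- map[!map$PROBEID %in% ambiguous, , drop = FALSE]
  # Keep only probesets that are actually in the expression matrix
  map <- map[map$PROBEID %in% rownames(expr), , drop = FALSE]
  sub <- expr[map$PROBEID, , drop = FALSE]
  # Average the probesets belonging to each gene
  sums   <- rowsum(sub, group = map$ENTREZID)
  counts <- as.vector(table(map$ENTREZID)[rownames(sums)])
  sums / counts
}

# Toy data standing in for the real matrix and the real select() output
expr <- matrix(c(1, 3, 5, 7, 9,
                 2, 4, 6, 8, 10),
               ncol = 2,
               dimnames = list(c("p1", "p2", "p3", "p4", "p5"),
                               c("s1", "s2")))
map <- data.frame(PROBEID  = c("p1", "p2", "p3", "p4", "p4", "p5"),
                  ENTREZID = c("10", "10", NA,   "20", "30", "40"),
                  stringsAsFactors = FALSE)

expr_eg <- collapse_to_entrez(expr, map)
print(expr_eg)
```

With the real data you would pass the matrix read from the .gct file as expr
and the select() result as map. The rownames of the result are then Entrez
Gene IDs, so they can be matched against a gene set list that uses Entrez
Gene IDs.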

Best,

Jim


>
> I'm using R version 2.15.2 and gage_2.8.0 on Platform: i386-w64-mingw32/i386 (32-bit).
>
> Thank you very much!
>
> Yiwen He
> DCB/CIT/NIH/HHS
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


