[BioC] where to find a list of housekeeping genes for Affy array (Gene 1.0 ST)
James W. MacDonald
jmacdon at uw.edu
Fri Apr 19 15:39:29 CEST 2013
On 4/19/2013 9:25 AM, Robert Castelo wrote:
> hi,
>
> if you're searching for something readily available in R, you can try
> the following:
>
> library(BiocInstaller) ## assuming you installed some BioC package once
> biocLite("tweeDEseqCountData")
>
> library(tweeDEseqCountData)
> data(hkGenes)
> length(hkGenes)
> [1] 669
> head(hkGenes)
> [1] "ENSG00000149925" "ENSG00000102144" "ENSG00000142676" "ENSG00000108298"
> [5] "ENSG00000144713" "ENSG00000075624"
>
> check out the help page of 'hkGenes' for the source publication of
> this list.
>
> as you see, these are Ensembl gene identifiers, but if you need Affy
> IDs from a particular Affy chip, let's say HG-U133 Plus 2.0, you can
> use the great identifier mapping functionality of the package
> GSEABase, which in principle is designed to map identifiers between
> gene sets and ExpressionSet objects, but which you can tweak to do
> this job for you by passing the housekeeping gene list as if it were
> one gene set:
>
> library(GSEABase)
>
> dummygs <- GeneSet(hkGenes, geneIdType=ENSEMBLIdentifier())
>
> hkGenesHGU133plus2AffyIDs <- geneIds(mapIdentifiers(dummygs,
> AnnoOrEntrezIdentifier("hgu133plus2")))
>
> length(hkGenesHGU133plus2AffyIDs)
> [1] 1263
>
> head(hkGenesHGU133plus2AffyIDs)
> [1] "200966_x_at" "214687_x_at" "238996_x_at" "1558365_at" "200737_at"
> [6] "200738_s_at"

Or you could use more direct methods:

library(hgu133plus2.db)
mapped.genes <- select(hgu133plus2.db, hkGenes, "PROBEID", "ENSEMBL")
head(mapped.genes)
          ENSEMBL     PROBEID
1 ENSG00000149925 200966_x_at
2 ENSG00000149925 214687_x_at
3 ENSG00000149925 238996_x_at
4 ENSG00000102144  1558365_at
5 ENSG00000102144   200737_at
6 ENSG00000102144 200738_s_at
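
Note that select() returns one row per gene/probeset pair, so a gene with several probesets shows up several times, and a gene with no probeset on the chip should come back with NA in PROBEID. Here is a minimal sketch of tidying that up in base R. It uses the six rows shown above as stand-in data so it runs on its own; the names probes.per.gene and hk.probes are just illustrative. For the Gene 1.0 ST array in the subject line the same select() call should work with the matching annotation package; I am assuming hugene10sttranscriptcluster.db in the comment below, so check that it fits how your data were summarized.

```r
## Stand-in for the select() result: the six rows printed above.
## With the real packages loaded you would instead use, e.g.
##   mapped.genes <- select(hgu133plus2.db, hkGenes, "PROBEID", "ENSEMBL")
## or, for Gene 1.0 ST (assumed package name, please verify):
##   mapped.genes <- select(hugene10sttranscriptcluster.db, hkGenes,
##                          "PROBEID", "ENSEMBL")
mapped.genes <- data.frame(
    ENSEMBL = rep(c("ENSG00000149925", "ENSG00000102144"), each = 3),
    PROBEID = c("200966_x_at", "214687_x_at", "238996_x_at",
                "1558365_at", "200737_at", "200738_s_at"),
    stringsAsFactors = FALSE)

## Drop genes that did not map to any probeset on this chip.
mapped.genes <- mapped.genes[!is.na(mapped.genes$PROBEID), ]

## How many probesets does each housekeeping gene have?
probes.per.gene <- table(mapped.genes$ENSEMBL)

## Unique probeset IDs, e.g. for subsetting an expression matrix.
hk.probes <- unique(mapped.genes$PROBEID)
```

You could then subset with something like exprs(eset)[rownames(eset) %in% hk.probes, ], where eset is your own ExpressionSet.
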
Best,
Jim
>
>
> cheers,
> robert.
>
>
> On 04/18/2013 02:03 AM, Jack Luo wrote:
>> Not sure whether it's an appropriate question for Bioconductor. Is
>> there a
>> place to find a list of housekeeping genes (identified by Affy)?
>>
>> Thanks,
>>
>> -Jack
>>
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099