[BioC] package for collapsing probe id to entrezid or gene symbols

Hervé Pagès hpages at fhcrc.org
Mon Mar 21 18:44:30 CET 2011


Hi Wendy,

FWIW, it seems to me that you can easily infer the mapping from
probes to Entrez ids by combining the information stored in the
<platform>probe and <platform>.db packages for a given platform.

For example, for the hgu95av2 platform, the hgu95av2probe package
contains the mapping between probes and probe set ids:

 > library(hgu95av2probe)
 > head(as.data.frame(hgu95av2probe))
                   sequence   x   y Probe.Set.Name Probe.Interrogation.Position
1 TGGCTCCTGCTGAGGTCCCCTTTCC 395 301        1138_at                         2631
2 GGCTGTGAATTCCTGTACATATTTC 322 441        1138_at                         2661
3 GCTTCAATTCCATTATGTTTTAATG 213 419        1138_at                         2703
4 GCCGTTTGACAGAGCATGCTCTGCG 279 435        1138_at                         2781
5 TGACAGAGCATGCTCTGCGTTGTTG 473 299        1138_at                         2787
6 CTCTGCGTTGTTGGTTTCACCAGCT 587 205        1138_at                         2799
   Target.Strandedness
1           Antisense
2           Antisense
3           Antisense
4           Antisense
5           Antisense
6           Antisense

(Note that, unlike the probe sets, the probes don't have ids, but are
uniquely identified by their x and y coordinates on the array.)
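
To make that note concrete, here is a small sketch that uses only the x, y
and Probe.Set.Name values of the six rows printed above, retyped by hand as
a toy data frame so that nothing needs to be installed. It checks that the
(x, y) pairs are unique and builds a per-probe key from them. The
`probe_key` column name is made up for this illustration; with the real
package you would start from as.data.frame(hgu95av2probe) instead.

```r
# Toy excerpt: x, y and probe set id of the six rows shown above,
# retyped by hand (not read from the hgu95av2probe package).
probes <- data.frame(
  x = c(395, 322, 213, 279, 473, 587),
  y = c(301, 441, 419, 435, 299, 205),
  Probe.Set.Name = rep("1138_at", 6),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when no (x, y) pair occurs more than once.
xy_is_unique <- anyDuplicated(probes[c("x", "y")]) == 0

# Hypothetical per-probe key built from the array coordinates.
probes$probe_key <- paste(probes$x, probes$y, sep = ",")
```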

And the hgu95av2.db package contains the mapping between probe set ids
and Entrez ids:

 > library(hgu95av2.db)
 > get("1138_at", hgu95av2ENTREZID)
[1] "6574"
 > mget(keys(hgu95av2ENTREZID)[1:5], hgu95av2ENTREZID)
$`1000_at`
[1] "5595"

$`1001_at`
[1] "7075"

$`1002_f_at`
[1] "1557"

$`1003_s_at`
[1] "643"

$`1004_at`
[1] "643"
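
Putting the two pieces together is then just a join on the probe set id.
Below is a rough sketch on toy data retyped from the output above (three of
the 1138_at probes, plus the "1138_at", "1003_s_at" and "1004_at" results),
so it runs without any annotation package. The toy `map` data frame and its
column names `probe_id` / `gene_id` are assumptions made for the
illustration. With the real packages I would expect the two tables to come
from as.data.frame(hgu95av2probe) and toTable(hgu95av2ENTREZID), but please
check the column names you actually get.

```r
# Toy probe table: three of the probes shown earlier, identified by (x, y).
probes <- data.frame(
  x = c(395, 322, 213),
  y = c(301, 441, 419),
  Probe.Set.Name = c("1138_at", "1138_at", "1138_at"),
  stringsAsFactors = FALSE
)

# Toy probe set -> Entrez id table, retyped from the get()/mget() output.
# Column names are assumed for this sketch.
map <- data.frame(
  probe_id = c("1138_at", "1003_s_at", "1004_at"),
  gene_id  = c("6574", "643", "643"),
  stringsAsFactors = FALSE
)

# Join on the probe set id; all.x = TRUE keeps probes whose probe set
# has no Entrez id (their gene_id becomes NA).
probe2entrez <- merge(probes, map,
                      by.x = "Probe.Set.Name", by.y = "probe_id",
                      all.x = TRUE)

# Going the other way: several probe sets can share one Entrez id.
sets_by_gene <- split(map$probe_id, map$gene_id)
```

Note that the mapping is many-to-one (here both 1003_s_at and 1004_at map to
643), so if you want one value per gene you still have to decide how to
summarize the probe sets that share an Entrez id.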

Cheers,
H.


On 03/15/2011 08:30 AM, Wendy Qiao wrote:
> Hi all,
>
> I am searching for a Bioconductor package that can collapse Affymetrix probe
> ids to gene symbols or Entrez ids, but I have not had any luck yet. Does
> anyone know of a package that can collapse probe ids to gene symbols?
>
> Thank you in advance,
> Wendy


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


