[BioC] Annotation of U95av2 array

James W. MacDonald jmacdon at med.umich.edu
Wed Apr 18 15:50:53 CEST 2007


Hi George,

Please don't take list conversations off-list. The list archives are 
intended to be a source of information, and on the off chance that I 
might say something useful, it would be nice if people could find this 
later.

As to your question: as I said below, we simply map from Entrez Gene to 
the other annotation sources, so whatever Entrez Gene says, we report. 
For example, if I grep for a probeset ID that maps to multiple UniGene 
IDs, I might get something like 35566_f_at, which maps to five UniGene IDs.
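
A minimal sketch of why one probeset can end up with several UniGene 
IDs, assuming the two-step lookup described above (probeset to Entrez 
Gene, then Entrez Gene to UniGene). This is Python for illustration 
only, not the AnnBuilder code. The two tables and the function name 
unigene_for_probeset are hypothetical, and the UniGene IDs are 
placeholders rather than real identifiers.

```python
# Hypothetical lookup tables. In practice the first would come from the
# Affy annotation files and the second from Entrez Gene.
PROBESET_TO_ENTREZ = {"35566_f_at": "3576"}

# Placeholder UniGene IDs: Entrez Gene lists five for this gene, but the
# real identifiers are deliberately not reproduced here.
ENTREZ_TO_UNIGENE = {
    "3576": [
        "Hs.PLACEHOLDER_1",
        "Hs.PLACEHOLDER_2",
        "Hs.PLACEHOLDER_3",
        "Hs.PLACEHOLDER_4",
        "Hs.PLACEHOLDER_5",
    ],
}


def unigene_for_probeset(probeset_id):
    """Return every UniGene ID listed for the probeset's Entrez Gene ID."""
    entrez_id = PROBESET_TO_ENTREZ.get(probeset_id)
    if entrez_id is None:
        return []  # probeset has no Entrez Gene mapping
    # One Entrez Gene ID can carry several UniGene IDs, so the result is
    # a list, not a single value.
    return list(ENTREZ_TO_UNIGENE.get(entrez_id, []))


ids = unigene_for_probeset("35566_f_at")
print(len(ids), ids)
```

Because the second table is one-to-many, the probeset inherits every 
UniGene ID attached to its gene; nothing in the lookup picks a single 
"best" ID.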

Now if I get the Entrez ID (3576), go to the Entrez Gene webpage for 
this ID, and scroll to the very bottom, I see five UniGene IDs that this 
Entrez Gene ID corresponds to. We report four of these five, the only 
difference being we report Hs.443948 instead of Hs.654584.

Our entry is obviously a mistake, because Hs.443948 is SLC4A1 rather 
than IL-8, but the hgu95av2 package was built on March 15, so maybe 
Entrez Gene has corrected this mistake in the interim.
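
The comparison above can be sketched as a set difference. This is a 
hedged Python illustration, not part of any Bioconductor package: the 
function name compare_unigene is hypothetical, and the four IDs that 
both sources agree on are placeholders, since only the two differing 
IDs are named in this message.

```python
def compare_unigene(reported, reference):
    """Split two ID collections into shared, reported-only, reference-only."""
    reported, reference = set(reported), set(reference)
    return {
        "shared": sorted(reported & reference),
        "only_reported": sorted(reported - reference),
        "only_reference": sorted(reference - reported),
    }


# Placeholders for the four UniGene IDs on which both sources agree.
agreed = [f"Hs.PLACEHOLDER_{n}" for n in range(1, 5)]

package_ids = agreed + ["Hs.443948"]      # what the hgu95av2 package reports
entrez_gene_ids = agreed + ["Hs.654584"]  # what the Entrez Gene page lists

result = compare_unigene(package_ids, entrez_gene_ids)
print(result["only_reported"])   # ['Hs.443948']
print(result["only_reference"])  # ['Hs.654584']
```

Running the same comparison over every probeset would flag the entries 
where the package and the current Entrez Gene record have drifted apart.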

See 
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=full_report&list_uids=3576

Best,

Jim


Tseng, George C. wrote:
> Jim,
> 
> Thanks so much for your response. I have one further question. In
> your annotation in Bioconductor, a probe set can map to multiple
> UniGene IDs. This really confuses me. Shouldn't it be only one ID?
> 
> George
> 
> -----Original Message----- From: James MacDonald
> [mailto:jmacdon at med.umich.edu] Sent: Sunday, April 01, 2007 9:59 AM 
> To: Tseng, George C. Cc: biocannotation at lists.fhcrc.org; Lu, Shu-Ya 
> Subject: Re: Annotation of U95av2 array
> 
> Hi George,
> 
> Tseng, George C. wrote:
> 
>> Dear Dr. MacDonald and other Biocore Data Team members,
>> 
>> I'm using your array annotations from Bioconductor in my research
>> and I teach it in my microarray course as well. It is indeed a
>> great tool for our data analysis and methodological development.
>> Recently we're working on a meta-analysis research project to
>> incorporate information from multiple data sets. My student took
>> the Unigene ID annotations in all the U95av2 probes and compared
>> with the result obtained from the Affymetrix website (the batch
>> search in NetAffy). Among the 9704 probes annotated in
>> Bioconductor, 724 probes were annotated completely differently in
>> NetAffy.
>> 
>> My question is: Do you obtain your Unigene ID annotation from
>> Affymetrix database or other source? NetAffy annotations always
>> have one Unigene ID to a probeset while your annotations can have
>> many. Can you give us some detail about your annotation procedure?
> 
> 
> Nianhua Li makes the annotation packages, so she would be the final 
> trusted source.
> 
> In the past, the process was to map Affy ID to Entrez Gene ID using
> the annotation files that Affy supply on their website. We then use 
> AnnBuilder to do the mappings from Entrez Gene to all other
> annotation sources, so it is not inconceivable that we would have
> different UniGene IDs for a given probeset.
> 
> In my experience, the BioC annotations are more up to date and
> accurate than what Affy supply either on Netaffx or in their
> annotation files. This is based on blatting the probe sequences.
> 
> Best,
> 
> Jim
> 
> 
> 
>> Thanks!
>> 
>> George
>> 
>> ============================================ George C. Tseng 
>> Assistant Professor Dept of Biostatistics and Human Genetics, 
>> University of Pittsburgh http://www.pitt.edu/~ctseng,
>> 412-624-5318 ============================================
> 
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623




