[BioC] athPkgBuilder data source :missing probesets

Björn Usadel usadel at mpimp-golm.mpg.de
Thu Aug 17 15:01:24 CEST 2006

Hi Nianhua,

As Tine pointed out, I would suggest you choose multiple...

Or you can do it the hard way. For our purposes (MapMan visualization
based on classification) I test whether the multiple genes hit by one
probeset have a similar function. If they do, I mix the annotations,
assuming that it might be a [diverged] gene family, in which case some
information might still be left when I sample the whole class. (Affy
used to tag such probesets _s_, but the Affy annotation is way outdated.)

However, if a probeset turns out to hit genes of different classes [not
gene families; the old _x_ tag] (e.g. glycolysis and, say,
proteasome-dependent degradation), I annotate the probeset as "hitting
multi" and put it in a special "non-evaluable" class.
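
A minimal sketch of the rule described above, in Python. It is an
illustration only, not the MapMan code: the function name, the toy
probeset and locus IDs, and the class names are all made up, and
"similar function" is simplified to "identical class label".

```python
HITTING_MULTI = "hitting multi"  # label for probesets with conflicting classes


def annotate_probesets(probe_to_loci, locus_to_class):
    """Assign one functional class per probeset, or flag it as ambiguous.

    probe_to_loci:  dict, probeset ID -> list of locus IDs it hits
    locus_to_class: dict, locus ID -> functional class name
    (Both inputs are hypothetical structures chosen for this sketch.)
    """
    result = {}
    for probeset, loci in probe_to_loci.items():
        # Distinct classes of all hit loci that have a known class.
        classes = {locus_to_class[locus] for locus in loci
                   if locus in locus_to_class}
        if not classes:
            result[probeset] = None  # no class known for any hit locus
        elif len(classes) == 1:
            # Single gene, or several genes sharing one class
            # (possibly a gene family): keep the shared class.
            result[probeset] = next(iter(classes))
        else:
            # Genes of different classes: not evaluable.
            result[probeset] = HITTING_MULTI
    return result


# Toy data, invented for illustration.
probe_to_loci = {
    "probe_A": ["locus_1"],
    "probe_B": ["locus_2", "locus_3"],
    "probe_C": ["locus_4", "locus_5"],
    "probe_D": ["locus_6"],
}
locus_to_class = {
    "locus_1": "glycolysis",
    "locus_2": "glycolysis",
    "locus_3": "glycolysis",
    "locus_4": "glycolysis",
    "locus_5": "protein degradation",
}

annotations = annotate_probesets(probe_to_loci, locus_to_class)
```

Here probe_B keeps its class because both loci agree, while probe_C
ends up in the "hitting multi" class. A real implementation would need
a looser notion of similarity than exact label equality.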

You could also try to determine if it is really a gene family that is 
hit, in which case the annotations would be similar as well anyway.

But that is a lot of querying and may, in the end, need manual curation.

Thanks for your work.


Nianhua Li wrote:
> Hi, Tine, Bjorn, Thomas and other Arabidopsis experts,
> Thanks a lot for the feedback. I will get the update done this week if you
> could help me solve the following problem :P
> In TAIR's probe-to-locus mapping file, for example
> ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/affy_ATH1_array_elements-2006-07-14.txt
> some probesets are mapped to more than one locus. However, in annotation packages
> ath1121501 and ag, all annotations (e.g. agCHRLOC, agENZYME) are indexed by
> probeset identifier. This assumes a one-to-one mapping between probeset and gene,
> so that the annotation of a gene is the annotation of a probeset.
> How should mappings from one probeset to multiple loci be handled? I can think of 3
> possible solutions:
> 1. pick the "best" locus, but how?
> 2. mix the annotations of all mapped loci together
> 3. set to NA
> Any suggestions are highly appreciated. Many thanks!
> nianhua

Björn Usadel, PhD

Max Planck Institute of Molecular Plant Physiology
System Regulation Group

Am Mühlenberg 1
D-14476 Golm

Tel    (+49 331) 567-8114

Email  usadel at mpimp-golm.mpg.de
WWW    mapman.mpimp-golm.mpg.de
