[BioC] athPkgBuilder data source :missing probesets
Björn Usadel
usadel at mpimp-golm.mpg.de
Thu Aug 17 15:01:24 CEST 2006
Hi Nianhua,
As Tine pointed out, I would suggest you choose multiple,
or you do it the hard way. For our purposes (MapMan visualization based
on classification), I test whether the multiple genes hit by one probeset
have a similar function. If they do, I mix the annotations,
assuming that it might be a [diverged] gene family, in which case there
might still be some information left when I sample the whole class (Affy
used to tag these _s_, but the Affy tagging is way outdated).
However, if a probeset turns out to hit genes of different classes [not
gene families; the old _x_ tag] (e.g. glycolysis and, say,
proteasome-dependent degradation), I annotate the probeset as "hitting
multi" and put it in a special "non-evaluable" class.
You could also try to determine whether it is really a gene family that
is hit, in which case the annotations would be similar anyway.
But that is a lot of querying and eventually needs manual intervention.
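The rule described above can be sketched in a few lines of plain Python.
This is only an illustration under stated assumptions, not the code used
for MapMan: "similar function" is simplified to "same class label", loci
without a known class are ignored, and all probeset, locus, and annotation
names in the example data are hypothetical.

```python
from typing import Dict, List, Optional

# Label taken from the description above; everything else is illustrative.
NON_EVALUABLE = "hitting multi"


def classify_probeset(loci: List[str], locus_class: Dict[str, str]) -> Optional[str]:
    """Return the shared class of the loci, NON_EVALUABLE, or None if unknown."""
    # Assumption: loci without a known class are simply skipped.
    classes = {locus_class[locus] for locus in loci if locus in locus_class}
    if not classes:
        return None
    if len(classes) == 1:
        return next(iter(classes))
    return NON_EVALUABLE


def annotate(
    probe_to_loci: Dict[str, List[str]],
    locus_class: Dict[str, str],
    locus_annotation: Dict[str, List[str]],
) -> Dict[str, dict]:
    """Mix annotations when all hit loci share a class; otherwise flag the probeset."""
    result = {}
    for probe, loci in probe_to_loci.items():
        cls = classify_probeset(loci, locus_class)
        if cls is None or cls == NON_EVALUABLE:
            merged: List[str] = []
        else:
            # Union of the annotations of every locus hit by this probeset.
            merged = sorted({a for locus in loci for a in locus_annotation.get(locus, [])})
        result[probe] = {"class": cls, "annotations": merged}
    return result


# Hypothetical example data (not taken from TAIR or any real array).
probe_to_loci = {
    "probe_A": ["locus_1"],
    "probe_B": ["locus_2", "locus_3"],  # same class: treated like a gene family
    "probe_C": ["locus_1", "locus_4"],  # different classes: not evaluable
}
locus_class = {
    "locus_1": "glycolysis",
    "locus_2": "protein degradation",
    "locus_3": "protein degradation",
    "locus_4": "protein degradation",
}
locus_annotation = {
    "locus_1": ["annot_a"],
    "locus_2": ["annot_b"],
    "locus_3": ["annot_b", "annot_c"],
    "locus_4": ["annot_d"],
}

result = annotate(probe_to_loci, locus_class, locus_annotation)
for probe, info in result.items():
    print(probe, info["class"], info["annotations"])
```

In this sketch a probeset that hits several classes keeps no annotations
at all, which corresponds to putting it in the special class rather than
guessing a "best" locus.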
Thanks for your work.
Cheers,
Björn
Nianhua Li wrote:
> Hi, Tine, Bjorn, Thomas and other Arabidopsis experts,
>
> Thanks a lot for the feedback. I will get the update done this week if you
> could help me solve the following problem :P
>
> In TAIR's probe-to-locus mapping file, for example
> ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/affy_ATH1_array_elements-2006-07-14.txt
>
> some probesets are mapped to more than one locus. However, in the annotation
> packages ath1121501 and ag, all annotations (e.g. agCHRLOC, agENZYME) are
> indexed by probeset identifier. This assumes a one-to-one mapping between
> probeset and gene, so that the annotation of a gene is the annotation of a
> probeset.
>
> How should I handle mappings from one probeset to multiple loci? I can think
> of 3 possible solutions:
> 1. pick the "best" locus, but how?
> 2. mix the annotations of all mapped loci together
> 3. set to NA
>
> Any suggestions are highly appreciated. Many thanks!
>
> nianhua
--
-+-+-+-+-+-+-+-+-+-+-+-
Björn Usadel, PhD
Max Planck Institute of Molecular Plant Physiology
System Regulation Group
Am Mühlenberg 1
D-14476 Golm
Germany
Tel (+49 331) 567-8114
Email usadel at mpimp-golm.mpg.de
WWW mapman.mpimp-golm.mpg.de