[BioC] athPkgBuilder data source :missing probesets

Justin Borevitz borevitz at uchicago.edu
Fri Aug 18 22:00:07 CEST 2006


Hi,

We have re-annotated the ATH1 probes with the V6 annotation for
Arabidopsis. You can find it here:
http://naturalvariation.org/methods/ath1V6anno.RData
In this probe setup, 25mers with multiple gene matches are excluded. We use
probe-level modeling for gene expression estimates. This is likely more than
you wanted, but I thought I would put it out there as my solution to the
problem.
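
The exclusion rule can be illustrated with a minimal sketch. This is not the
code behind ath1V6anno.RData; the probe IDs, gene names, and the dict layout
are all hypothetical and only show the idea of keeping 25mers that match
exactly one gene.

```python
from collections import defaultdict


def unique_probes_by_gene(probe_hits):
    """Group probes by gene, keeping only probes that match exactly one gene.

    probe_hits: dict mapping a probe ID to the set of genes its 25mer matches
    (hypothetical layout, assumed for illustration).
    """
    by_gene = defaultdict(list)
    for probe_id, genes in sorted(probe_hits.items()):
        if len(genes) == 1:  # drop unmatched and multi-match probes
            (gene,) = genes
            by_gene[gene].append(probe_id)
    return dict(by_gene)


# Made-up example data.
hits = {
    "p1": {"geneA"},
    "p2": {"geneA", "geneB"},  # multiple gene matches: excluded
    "p3": {"geneB"},
    "p4": set(),               # no match: excluded
}
kept = unique_probes_by_gene(hits)
print(kept)
```

The surviving probes per gene would then feed whatever probe-level model is
used for the expression estimates.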

Justin Borevitz


Date: Thu, 17 Aug 2006 15:01:24 +0200
From: Björn Usadel <usadel at mpimp-golm.mpg.de>
Subject: 		
To: Nianhua Li <nli at fhcrc.org>
Cc: bioconductor at stat.math.ethz.ch
Message-ID: <44E468A4.7050101 at mpimp-golm.mpg.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi Nianhua,

As Tine pointed out, I would suggest you choose multiple....

Or you do it the hard way. For our purposes (MapMan visualization based
on classification), I test whether the multiple genes hit by one probeset
have a similar function. If so, I merge the annotations, assuming that it
might be a [diverged] gene family, in which case there might still be some
information left when I sample the whole class. (Affy used to tag these
probesets _s_, but the Affy annotation is way outdated.)

However, if a probeset turns out to hit genes of different classes [not
gene families; the old _x_ tag] (e.g. glycolysis and, say,
proteasome-dependent degradation), I annotate the probeset as
"hitting multi" and put it in a special "non-evaluable" class.

You could also try to determine if it is really a gene family that is 
hit, in which case the annotations would be similar as well anyway.

But that is a lot of querying and eventually needs manual interaction.
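
A minimal sketch of the rule described above, under simplifying assumptions:
"similar function" is reduced to "exactly the same class label", and the
probeset IDs, locus names, and class assignments are made up. It is not the
MapMan code, only an illustration of the decision.

```python
MULTI = "hitting multi"  # label for the special non-evaluable class


def classify_probeset(loci, locus_class):
    """Return a class label for a probeset given the loci it hits.

    loci: iterable of locus IDs hit by the probeset.
    locus_class: dict mapping locus ID to a functional class label
    (hypothetical lookup table, assumed for illustration).
    """
    classes = {locus_class[locus] for locus in loci if locus in locus_class}
    if not classes:
        return None               # nothing known about any hit locus
    if len(classes) == 1:
        return next(iter(classes))  # single gene or same-class family
    return MULTI                  # loci from different classes


# Made-up example data.
locus_class = {
    "locus1": "glycolysis",
    "locus2": "glycolysis",
    "locus3": "protein degradation",
}
probesets = {
    "ps1_at": ["locus1"],
    "ps2_s_at": ["locus1", "locus2"],
    "ps3_x_at": ["locus1", "locus3"],
    "ps4_at": ["locus9"],
}
result = {ps: classify_probeset(loci, locus_class)
          for ps, loci in probesets.items()}
print(result)
```

A real test for similar function would need a notion of class distance
rather than strict equality, which is where the manual interaction comes in.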


Thanks for your work.

Cheers,
Björn

Nianhua Li wrote:
> Hi, Tine, Bjorn, Thomas and other Arabidopsis experts,
> 
> Thanks a lot for the feedback. I will get the update done this week if you
> could help me to solve the following problem :P
> 
> In TAIR's probe-to-locus mapping file, for example
> ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/affy_ATH1_array_elements-2006-07-14.txt
> 
> some probesets are mapped to more than one locus. However, in the annotation
> packages ath1121501 and ag, all annotations (e.g. agCHRLOC, agENZYME) are
> indexed by probeset identifier. This assumes a one-to-one mapping between
> probeset and gene, so that the annotation of a gene is the annotation of a
> probeset.
> 
> How should mappings from one probeset to multiple loci be handled? I can
> think of 3 possible solutions:
> 1. pick the "best" locus, but how?
> 2. mix the annotations of all mapped loci together
> 3. set to NA
> 
> Any suggestions are highly appreciated. Many thanks!
> 
> nianhua
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
-+-+-+-+-+-+-+-+-+-+-+-
Björn Usadel, PhD

Max Planck Institute of Molecular Plant Physiology
System Regulation Group

Am Mühlenberg 1
D-14476 Golm
Germany

Tel    (+49 331) 567-8114

Email  usadel at mpimp-golm.mpg.de
WWW    mapman.mpimp-golm.mpg.de


