[BioC] ragene10st

Manhong Dai daimh at umich.edu
Tue Mar 3 15:35:51 CET 2009


Hi Sebastien,


	Custom CDF version 11 is at
http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download.asp#v11

	If you prefer the Entrez Gene based CDF, it is at
http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/11.0.1/entrezg.asp (search for RaGene10stv1 on that page).


	In the entrezg custom CDF, each probeset ID is already an Entrez Gene
ID (with an "_at" suffix). That is why the probeset IDs you saw in the
NuGO custom CDF version 10 annotation package are not the same as the
probeset IDs in Affymetrix's original CDF file.
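
	As a rough illustration of that naming scheme (a sketch only, not part
of the custom CDF tooling): assuming entrezg probeset IDs are simply the
Entrez Gene ID followed by "_at", as in the IDs quoted below, the gene IDs
can be recovered in base R. The helper name entrezg_to_entrez is made up for
this example.

```r
# Hypothetical helper: strip the "_at" suffix from entrezg-style probeset IDs.
# Assumes IDs look like "<EntrezGeneID>_at"; anything else becomes NA.
entrezg_to_entrez <- function(probeset_ids) {
  is_entrezg <- grepl("^[0-9]+_at$", probeset_ids)
  ifelse(is_entrezg, sub("_at$", "", probeset_ids), NA_character_)
}

# The first two IDs come from the entrezg package output quoted below;
# the third is an original Affymetrix-style ID and is not converted.
ids <- c("112400_at", "113882_at", "10700001")
entrez_ids <- entrezg_to_entrez(ids)
```

IDs that do not match the pattern are returned as NA rather than guessed at, so original Affymetrix probeset IDs are never mistaken for gene IDs.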


Best,
Manhong

> Date: Tue, 03 Mar 2009 16:08:33 +1100
> From: Sebastien Gerega <seb at gerega.net>
> Subject: Re: [BioC] ragene10st
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <49ACBB51.8070904 at gerega.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Thank you Marc and Manhong for your suggestions.
> I have attempted both methods and run into some problems. Firstly, I was 
> able to build ragene10st.db using the following code:
> 
> source("http://bioconductor.org/biocLite.R")
> biocLite("rat.db0")
> 
> library(AnnotationDbi)
> fname = "RaGene-1_0-st-v1.EDITED.txt"
> wdir = getwd()   
> makeRATCHIP_DB(affy=FALSE,
>     prefix="ragene10st",
>     fileName=fname,
>     baseMapType="eg",
>     outputDir = wdir,
>     version="1.0.0",
>     manufacturer = "Affymetrix",
>     chipName = "Rat Gene ST Array",
>     manufacturerUrl = "http://www.affymetrix.com")
> 
> I then used this library for annotation of an analysis I performed. At 
> this point I realised that about one third of the 29171 probes were 
> assigned the gene symbol "RT1-C113". I realise this is because the
> annotation file I used was in the wrong format: I had used the
> "mrna_assignment" column, whose entries are in a complex delimited
> format. Here are a couple of examples:
> NM_001099458 // RefSeq // Rattus norvegicus similar to putative 
> pheromone receptor (RGD1564110), mRNA. // chr1 // 49 // 74 // 19 // 39 
> // 0 ///
> ENSRNOT00000046204 // Rn.217623 // ---
> NM_001099461 // Rn.217622 // --- /// NM_001099461 // Rn.217622 // --- 
> /// ENSRNOT00000041455 // Rn.217622 // --- /// ENSRNOT00000046204 // 
> Rn.217623 // ---
> 
> Unfortunately, for the Gene ST chips there are no columns that simply
> contain GenBank, UniGene, or RefSeq IDs.
> 
> So instead I tried Manhong's suggestion of using a custom CDF but there 
> is no custom CDF for rat gene ST arrays on the 
> http://brainarray.mbni.med.umich.edu/ website. However, if I follow the 
> link to http://nugo-r.bioinformatics.nl/NuGO_R.html I am able to locate 
> an appropriate CDF. Unfortunately, upon further examination of this CDF 
> package it appears as though the wrong probe IDs have been used.
> For example:
>  > as.list(ragene10stv1rnentrezgSYMBOL)[1:5]
> $`112400_at`
> [1] "Nrg1"
> 
> $`113882_at`
> [1] "Hemgn"
> 
> $`113886_at`
> [1] "Kif1c"
> 
> $`113892_at`
> [1] "Cml3"
> 
> As far as I am aware the probe IDs used for rat gene ST arrays are in 
> the following format (8 digits without "_at"):
> 10700001
> 10700003
> 10700004
> 10700005
> 10700013
> 
> Can anyone provide any advice for either of the two options?
> thanks,
> Sebastien
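
One possible way around the "mrna_assignment" format quoted above is to pull
out just the accessions. This is a minimal sketch in base R, assuming that
records are separated by "///", that fields within a record are separated by
"//", and that the accession is the first field of each record. That layout
is inferred from the two examples only, not from Affymetrix documentation.
The function name extract_accessions and the RefSeq prefix pattern are
assumptions made for illustration.

```r
# Hypothetical helper: collect the unique accessions from one
# mrna_assignment entry, keeping only those that match `pattern`
# (by default, RefSeq-style transcript accessions such as NM_...).
extract_accessions <- function(x, pattern = "^(NM|NR|XM|XR)_[0-9]+") {
  # Split the entry into records on the triple-slash separator.
  records <- strsplit(x, "///", fixed = TRUE)[[1]]
  # Take the first "//"-separated field of each record; empty records give "".
  first_field <- vapply(
    strsplit(records, "//", fixed = TRUE),
    function(f) if (length(f) > 0) trimws(f[1]) else "",
    character(1)
  )
  unique(first_field[grepl(pattern, first_field)])
}

# Second example entry from the message above, joined onto one line.
entry <- paste(
  "NM_001099461 // Rn.217622 // --- /// NM_001099461 // Rn.217622 // ---",
  "/// ENSRNOT00000041455 // Rn.217622 // --- /// ENSRNOT00000046204 //",
  "Rn.217623 // ---"
)
refseq <- extract_accessions(entry)
```

Passing pattern = "^ENSRNOT" instead would keep the Ensembl transcript IDs. Entries that list more than one accession would still need a rule for choosing among them before being written to a single ID column.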