[BioC] Analysis of Affymetrix Mouse Gene 2.0 ST arrays
Naxerova, Kamila
naxerova at fas.harvard.edu
Thu Mar 7 00:16:19 CET 2013
Dear Christian and Jim,
many thanks to both of you for your explanations.
Your hard work paid off: I have finally understood everything and managed to build my annotation package! I wrote a little script similar to what Jim was suggesting, namely picking the first RefSeq ID I came across for each transcript cluster. Jim called it "naive", but I think there is no downside to this approach, right? I have spent a long time looking at various examples in the Affymetrix annotation file, and simply picking the first RefSeq ID seems to be kosher.
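# Read the Affymetrix transcript annotation file (leading comment lines already removed)
data <- read.csv("MoGene-transcript-noheader.csv", header = TRUE, stringsAsFactors = FALSE)
# Keep the probe ID (column 1) and the annotation string parsed below (column 9)
sdata <- data[, c(1, 9)]
returnRef <- function(x) {
  # Entries are separated by "///"; fields within an entry by "//"
  entries <- strsplit(x, split = "///")[[1]]
  # Take the first entry that mentions RefSeq (NA if there is none)
  refst <- entries[grep("RefSeq", entries)[1]]
  # The accession is the first field; strip the padding spaces
  refid <- gsub(" ", "", strsplit(refst, split = "//")[[1]][1])
  return(refid)
}
sdata$refseqids <- sapply(sdata[, 2], returnRef)
fdata <- sdata[, -2]
# Two tab-delimited columns, no header: probe ID and RefSeq ID
write.table(fdata, "AnnotBuild.txt", sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)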
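To make the "first RefSeq ID" rule concrete, here is a small self-contained sketch run on made-up annotation strings. The helper name first_refseq and the accession numbers are invented for illustration; only the "id // source // description /// ..." layout is taken from the script above. It shows the two cases worth knowing about: an entry with several RefSeq IDs keeps only the first one, and an entry with none comes back as NA.

```r
# Hypothetical helper: same idea as returnRef() in the script above.
first_refseq <- function(x) {
  # Split the annotation string into its "///"-separated entries
  entries <- strsplit(x, split = "///", fixed = TRUE)[[1]]
  # Index of the first entry mentioning RefSeq (NA when there is none)
  hit <- grep("RefSeq", entries, fixed = TRUE)[1]
  if (is.na(hit)) return(NA_character_)
  # The accession is the first "//"-separated field of that entry
  trimws(strsplit(entries[hit], split = "//", fixed = TRUE)[[1]][1])
}

# Made-up strings; the accessions are placeholders, not real annotations.
example <- c(
  one_refseq = "ENSMUST00000000001 // ENSEMBL // made-up /// NM_000001 // RefSeq // made-up",
  two_refseq = "NM_000002 // RefSeq // made-up /// NM_000003 // RefSeq // made-up",
  no_refseq  = "---"
)

ids <- vapply(example, first_refseq, character(1))
print(ids)

# Number of entries that would end up without a RefSeq ID
n_missing <- sum(is.na(ids))
print(n_missing)
```

Running the same count on the real file (for example sum(is.na(sdata$refseqids))) would show how many probe IDs get no RefSeq ID under this rule.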
library(AnnotationForge)
library(mouse.db0)
library(org.Mm.eg.db)
makeDBPackage("MOUSECHIP_DB",
              affy = FALSE,
              prefix = "mogene20sttranscriptcluster",
              fileName = "AnnotBuild.txt",
              outputDir = ".",
              version = "2.11.1",
              baseMapType = "refseq",
              manufacturer = "Affymetrix",
              chipName = "Mouse Gene 2.0 ST Array",
              manufacturerUrl = "http://www.affymetrix.com",
              author = "Kamila Naxerova",
              maintainer = "Kamila Naxerova <naxerova at fas.harvard.edu>")
> install.packages("mogene20sttranscriptcluster.db",repos=NULL, type="source")
* installing *source* package ‘mogene20sttranscriptcluster.db’ ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
*** arch - i386
*** arch - x86_64
* DONE (mogene20sttranscriptcluster.db)
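A quick way to sanity-check a freshly built package is to look up a few probe IDs. The following is only a sketch: it assumes the package installed above is available, and that the resulting chip package exposes the usual PROBEID, REFSEQ and SYMBOL columns through the AnnotationDbi select() interface. It does nothing if the package is not installed.

```r
# Name of the package built above
pkg <- "mogene20sttranscriptcluster.db"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Attaching the chip package also attaches AnnotationDbi (keys, select)
  library(pkg, character.only = TRUE)
  # The annotation object carries the same name as the package
  db <- get(pkg)
  # Look up RefSeq IDs and gene symbols for the first few probe IDs
  probes <- head(keys(db, keytype = "PROBEID"))
  print(select(db, keys = probes,
               columns = c("REFSEQ", "SYMBOL"),
               keytype = "PROBEID"))
} else {
  message("Package ", pkg, " is not installed; nothing to check.")
}
```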