[BioC] Note on XPS for MoGene and RaGene arrays

Sat Aug 8 17:01:35 CEST 2009

As many users know xps supports the whole genome arrays HuGene, MoGene 
and RaGene, both release r3 and release r4. While xps works with HuGene 
arrays as expected some users have currently reported problems with 
MoGene arrays. The reason is that Affymetrix has just updated the 
PGF-files for the MoGene and RaGene arrays by deleting an AFFX control 
probeset which was causing an error in APT when using the option "-a 
dabg" (see the corresponding README files). However, at the moment 
Affymetrix has not deleted this probeset from the corresponding 
transcript and probset annotation files, which causes an error in xps 
since it checks for consistency between these files.

Affymetrix has already promised to update the annotation files in the 
next release of the files. Meanwhile, I need to ask the users of xps to 
use the following function to update the annotation files for the MoGene 
and RaGene arrays:

"updateAnnotation" <- function(infile, outfile, probeset, skip, eol="\n") {
   ## read header and probesets
   cat("reading", infile, "...\n");
   header <- readLines(infile, n=skip);
   annot  <- read.csv(infile, colClasses="character", comment.char="", 
skip=skip);

   ## delete probeset
   line  <- which(annot[,"probeset_id"] == probeset);
   if (length(line) > 0) {
      cat("deleting line", line, "for probeset", probeset, "...\n");
      annot <- annot[-line,];
   }#if

   ## write header and append probesets
   ## (use binary file to prevent conversion of LF to CRLF on WinXP)
   cat("writing", outfile, "...\n");
   file <- file(outfile, "wb")
   writeLines(header, con=file, sep=eol);
   write.table(annot, file=file, append=TRUE, sep=",", eol=eol, 
row.names=FALSE);
   close(file)
}#updateAnnotation

For MoGene arrays please use the following code to update the annotation 
files:

# probeset annotation
updateAnnotation("MoGene-1_0-st-v1.na29.mm9.probeset.csv", 
"MoGene-1_0-st-v1.na29.mm9.probeset.fixed.csv", probeset="10338063", 
skip=18, eol="\n")
# transcript annotation
updateAnnotation("MoGene-1_0-st-v1.na29.mm9.transcript.csv", 
"MoGene-1_0-st-v1.na29.mm9.transcript.fixed.csv", probeset="10338063", 
skip=19, eol="\n")

You can then create the scheme for MoGene as follows:

scheme.mogene10stv1r4.na29 <- 
import.exon.scheme("Scheme_MoGene10stv1r4_na29",filedir=scmdir,
paste(libdir,"MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.clf",sep="/"), 

paste(libdir,"MoGene-1_0-st-v1.r4.analysis-lib-files/MoGene-1_0-st-v1.r4.pgf",sep="/"), 

paste(anndir,"MoGene-1_0-st-v1.na29.mm9.probeset.fixed.csv",sep="/"),
paste(anndir,"MoGene-1_0-st-v1.na29.mm9.transcript.fixed.csv",sep="/"))

For RaGene arrays please use the following code to update the annotation 
files:

# probeset annotation
updateAnnotation("RaGene-1_0-st-v1.na29.rn4.probeset.csv", 
"RaGene-1_0-st-v1.na29.rn4.probeset.fixed.csv", probeset="10700063", 
skip=18, eol="\n")
# transcript annotation
updateAnnotation("RaGene-1_0-st-v1.na29.rn4.transcript.csv", 
"RaGene-1_0-st-v1.na29.rn4.transcript.fixed.csv", probeset="10700063", 
skip=19, eol="\n")

You can then create the scheme for RaGene as follows:

scheme.ragene10stv1r4.na29 <- 
import.exon.scheme("Scheme_RaGene10stv1r4_na29",filedir=scmdir,
paste(libdir,"RaGene-1_0-st-v1.r4.analysis-lib-files/RaGene-1_0-st-v1.r4.clf",sep="/"), 

paste(libdir,"RaGene-1_0-st-v1.r4.analysis-lib-files/RaGene-1_0-st-v1.r4.pgf",sep="/"), 

paste(anndir,"RaGene-1_0-st-v1.na29.rn4.probeset.fixed.csv",sep="/"),
paste(anndir,"RaGene-1_0-st-v1.na29.rn4.transcript.fixed.csv",sep="/"))

Please feel free to ask me if you have any further questions.

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._