[BioC] xps - number of probesets for HuGene10ST array

cstrato cstrato at aon.at
Fri May 15 20:55:02 CEST 2009


Dear Michael

When you look at the annotation file 
"HuGene-1_0-st-v1.na27.hg18.transcript.csv" you will see that it 
contains in total 33297 probesets, which are divided as follows:

  28869  "main"
  57     "control->affx"
  45     "control->bgp->antigenomic"
  1195   "normgene->exon"
  2904   "normgene->intron"
  227    "rescue->FLmRNA->unmapped"

For further analysis, e.g. rma, only the "main" and "control->affx" 
probesets are used (28869 + 57), resulting in 28926 probesets, which is 
the number of probesets you mention.
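
As a quick sanity check, the arithmetic can be reproduced in R. This is only a sketch: the counts are typed in by hand from the table above, and the variable names (counts, total, used, controls) are mine, not part of xps.

```r
# Probeset counts per category, copied by hand from the breakdown above
counts <- c(
  "main"                      = 28869,
  "control->affx"             = 57,
  "control->bgp->antigenomic" = 45,
  "normgene->exon"            = 1195,
  "normgene->intron"          = 2904,
  "rescue->FLmRNA->unmapped"  = 227
)

# All probesets listed in the transcript annotation file
total <- sum(counts)                               # 33297

# Probesets kept for further analysis such as rma
used <- sum(counts[c("main", "control->affx")])    # 28926

# The four control/normgene categories taken together
controls <- sum(counts[c("control->affx",
                         "control->bgp->antigenomic",
                         "normgene->exon",
                         "normgene->intron")])     # 4201

total - used       # 4371, the first difference you report
total - controls   # 29096, the non-control count you quote
```

The remaining 227 are the "rescue->FLmRNA->unmapped" probesets.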

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._


Michael Walter wrote:
> Dear List,
>
> I have another question about the xps package. I installed ROOT and xps, created the ROOT schemes for the HuGene1.0 array and imported the first CEL files following the example scripts. Everything works fine up to that point. However, after RMA normalization I'm missing a couple of probesets. When I use the Affymetrix Expression Console I get 33297 probesets. 4201 of these are +ve and -ve controls, leaving 29096 probesets. With the xps package I get 28926 probesets with only 57 controls. Thus I'm missing 4371 and 227 probesets, respectively. I already checked the archives and found a similar question. However, the number of probesets I get is with exonlevel="all". I attached the code and session info below. My question now is: where are the missing probesets, since I use the very same pgf and clf files for xps and Expression Console?
>
> Thanks for your thoughts,
>
> Michael
>
>
> R code:
>
> library(xps)
>
> xpsdir = "C:/xps"
> scmdir=paste(xpsdir, "schemes", sep="/")
> libdir= paste(xpsdir, "library/HuGene10ST", sep="/")
> anndir= paste(xpsdir, "annotation", sep="/")
>
>
> scheme.hugene10stv1r4.na27 <- import.exon.scheme(
> 	"Scheme_HuGene10stv1r4_na27_2",
> 	filedir=scmdir,
> 	layoutfile=paste(libdir,"HuGene-1_0-st-v1.r4.clf",sep="/"),
> 	schemefile=paste(libdir,"HuGene-1_0-st-v1.r4.pgf",sep="/"),
> 	probeset=paste(anndir,"HuGene-1_0-st-v1.na27.2.hg18.probeset.csv",sep="/"),
> 	transcript=paste(anndir,"HuGene-1_0-st-v1.na27.hg18.transcript.csv",sep="/"),
> 	verbose=TRUE)
>
> celdir <- getwd()
>
> scheme.HuGene10 <- root.scheme(paste("C:/xps/schemes","Scheme_HuGene10stv1r4_na27_2.root",sep="/"))
> data.test <- import.data(scheme.HuGene10, "DataTest", celdir=celdir)
> str(data.test)
>
> data.test2 <- attachMask(data.test)
> data.test2 <- attachInten(data.test2)
>
> data.test2 <- removeInten(data.test2)
> data.test2 <- removeMask(data.test2)
>
> data.rma <- rma(data.test2, "tmpdt_Test2RMA", background="antigenomic", normalize=TRUE,
> 	exonlevel="all", verbose = FALSE)
>
> expr.rma <- validData(data.rma)
>
> call.dabg <- dabg.call(data.test2, "tmpdt_Test2DABG",exonlevel="all", verbose = FALSE)
>
> Session Info:
>
> R version 2.8.1 (2008-12-22) 
> i386-pc-mingw32 
>
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
>
> other attached packages:
> [1] xps_1.2.6
>
>


