[BioC] xps - number of probesets for HuGene10ST array

cstrato cstrato at aon.at
Tue May 19 21:50:51 CEST 2009


Dear Michael

If you really want to compute unmapped FLmRNAs and pos/neg controls 
together with "core+affx" probesets, you can do:

data.rma <- rma(data.genome, "HuGeneMixRMA926780", filedir=datdir, 
                tmpdir="", background="antigenomic", normalize=TRUE, 
                exonlevel=c(926780,926780,926780))
export.expr(data.rma, outfile="HuGeneMixRMA926780.txt")

Until now this was an undocumented feature, since I do not recommend 
using these probesets: in my opinion they will only add noise to the 
"core" probesets. However, in the new version "xps_1.4.3" I have updated 
the help page "?exonLevel", which shows which numbers to add for the 
different combinations.
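
For the ROC-based QC asked about in the quoted mail, a minimal base-R 
sketch of the area under the ROC curve for positive versus negative 
controls could look like the following. It is only an illustration, not 
part of xps: the function name "control_auc" is hypothetical, the 
assignment of "normgene->exon" as positive and "normgene->intron" as 
negative controls is an assumption, and the input values are simulated 
rather than taken from a real array. How to select the control rows from 
the exported expression table is not shown here.

```r
# AUC of positive vs. negative controls, computed from the rank-sum
# (Mann-Whitney) statistic. 'pos' and 'neg' are numeric vectors of
# summarized signals; both names are hypothetical placeholders.
control_auc <- function(pos, neg) {
  pos <- pos[!is.na(pos)]
  neg <- neg[!is.na(neg)]
  n_pos <- length(pos)
  n_neg <- length(neg)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(c(pos, neg))  # midranks, so ties count as 0.5
  u <- sum(r[seq_len(n_pos)]) - n_pos * (n_pos + 1) / 2
  u / (n_pos * n_neg)
}

# Simulated values for illustration only (not real array data);
# the group sizes mirror the normgene->exon / normgene->intron counts.
set.seed(1)
pos <- rnorm(1195, mean = 8, sd = 1.5)
neg <- rnorm(2904, mean = 5, sd = 1.5)
auc <- control_auc(pos, neg)
print(auc)
```

An AUC near 1 means the positive controls are almost always brighter 
than the negative controls; a value near 0.5 means the two groups are 
not separated.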

In principle, it is now also possible to use all 33297 probesets, 
including the antigenomic probesets, by setting 
"exonlevel=c(992316,992316,992316)". In this case, however, you need to 
update to "xps_1.4.3"; otherwise you will get a bus error.
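
As a sanity check on these numbers, the per-category counts from the 
annotation file (listed in the quoted mail below) can be added up in 
plain R. The last value assumes that 926780 selects every category 
except the antigenomic background probesets, which is what the 
description above suggests but is not verified here.

```r
# Probeset counts per category, as listed for
# HuGene-1_0-st-v1.na27.hg18.transcript.csv in the quoted mail
counts <- c(
  "main"                      = 28869,
  "control->affx"             = 57,
  "control->bgp->antigenomic" = 45,
  "normgene->exon"            = 1195,
  "normgene->intron"          = 2904,
  "rescue->FLmRNA->unmapped"  = 227
)

total     <- sum(counts)                              # all probesets
core_affx <- sum(counts[c("main", "control->affx")])  # default for rma
controls  <- sum(counts[c("control->affx",
                          "control->bgp->antigenomic",
                          "normgene->exon",
                          "normgene->intron")])       # +ve/-ve controls
# Assumption: 926780 = everything except the antigenomic probesets
mixed     <- total - counts[["control->bgp->antigenomic"]]

print(c(total = total, core_affx = core_affx,
        controls = controls, non_control = total - controls,
        mixed = mixed))
# total 33297, core_affx 28926, controls 4201,
# non_control 29096, mixed 33252
```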

Best regards
Christian


Michael Walter wrote:
> Dear Christian,
>
> Thanks for your reply. That makes things clear. I was missing the unmapped FLmRNAs. However, is there also a way to get the summarized signals for the control probes? Affymetrix recommends ROC curves between positive and negative controls for QC, and I'd like to calculate these directly in Bioconductor. 
>
> Thanks again,
>
> Michael
>
>   
>> -----Original Message-----
>> From: "cstrato" <cstrato at aon.at>
>> Sent: 15.05.09 20:57:05
>> To: Michael Walter <michael.walter at med.uni-tuebingen.de>
>> CC: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] xps - number of probesets for HuGene10ST array
>>     
>
>
>   
>> Dear Michael
>>
>> When you look at the annotation file 
>> "HuGene-1_0-st-v1.na27.hg18.transcript.csv" you will see that it 
>> contains in total 33297 probesets, which are divided as follows:
>>
>>   28869  "main"
>>   57     "control->affx"
>>   45     "control->bgp->antigenomic"
>>   1195   "normgene->exon"
>>   2904   "normgene->intron"
>>   227    "rescue->FLmRNA->unmapped"
>>
>> For further analysis, e.g. rma, only "main" and "affx" are used 
>> resulting in 28926 probesets, which is the number of probesets you mention.
>>
>> Best regards
>> Christian
>> _._._._._._._._._._._._._._._._._._
>> C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
>> V.i.e.n.n.a           A.u.s.t.r.i.a
>> e.m.a.i.l:        cstrato at aon.at
>> _._._._._._._._._._._._._._._._._._
>>
>>
>> Michael Walter wrote:
>>     
>>> Dear List,
>>>
>>> I have another question about the xps package. I installed ROOT and xps, created the ROOT schemes for the HuGene1.0 array and imported the first CEL files following the example scripts. Everything works fine up to that point. However, after RMA normalization I'm missing a couple of probesets. With the Affymetrix Expression Console I get 33297 probesets. 4201 of these are +ve and -ve controls, leaving 29096 probesets. With the xps package I get 28926 probesets, of which only 57 are controls. Thus I'm missing 4371 probesets in total and 227 non-control probesets. I already checked the archives and found a similar question. However, the number of probesets I get is with exonlevel="all". I attached the code and session info below. My question now is: Where are the missing probesets, since I use the very same pgf and clf files for xps and Expression Console?
>>>
>>> Thanks for your thoughts,
>>>
>>> Michael
>>>
>>>
>>> R code:
>>>
>>> library(xps)
>>>
>>> xpsdir = "C:/xps"
>>> scmdir=paste(xpsdir, "schemes", sep="/")
>>> libdir= paste(xpsdir, "library/HuGene10ST", sep="/")
>>> anndir= paste(xpsdir, "annotation", sep="/")
>>>
>>>
>>> scheme.hugene10stv1r4.na27 <- import.exon.scheme(
>>> 	"Scheme_HuGene10stv1r4_na27_2",
>>> 	filedir=scmdir,
>>> 	layoutfile=paste(libdir,"HuGene-1_0-st-v1.r4.clf",sep="/"),
>>> 	schemefile=paste(libdir,"HuGene-1_0-st-v1.r4.pgf",sep="/"),
>>> 	probeset=paste(anndir,"HuGene-1_0-st-v1.na27.2.hg18.probeset.csv",sep="/"),
>>> 	transcript=paste(anndir,"HuGene-1_0-st-v1.na27.hg18.transcript.csv",sep="/"),
>>> 	verbose=TRUE)
>>>
>>> celdir <- getwd()
>>>
>>> scheme.HuGene10 <- root.scheme(paste("C:/xps/schemes","Scheme_HuGene10stv1r4_na27_2.root",sep="/"))
>>> data.test <- import.data(scheme.HuGene10, "DataTest", celdir=celdir)
>>> str(data.test)
>>>
>>> data.test2 <- attachMask(data.test)
>>> data.test2 <- attachInten(data.test2)
>>>
>>> data.test2 <- removeInten(data.test2)
>>> data.test2 <- removeMask(data.test2)
>>>
>>> data.rma <- rma(data.test2, "tmpdt_Test2RMA", background="antigenomic", normalize=TRUE, 
>>> 	exonlevel="all", verbose = FALSE)
>>>
>>> expr.rma <- validData(data.rma)
>>>
>>> call.dabg <- dabg.call(data.test2, "tmpdt_Test2DABG",exonlevel="all", verbose = FALSE)
>>>
>>> Session Info:
>>>
>>> R version 2.8.1 (2008-12-22) 
>>> i386-pc-mingw32 
>>>
>>> locale:
>>> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base     
>>>
>>> other attached packages:
>>> [1] xps_1.2.6
>>>
>>>
>>>       
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>     
>
>


