[BioC] Can't normalize 300+ HuGene arrays in xps
Mike Walter
michael_walter at email.de
Mon Aug 30 13:23:44 CEST 2010
Dear Christian,
Thanks for your help. To answer your questions first: I normally use RGui and my disk space was ~100 GB. I also tried the add.data=FALSE option, without success.
So I did RMA normalization with 6 arrays in RTerm as you proposed. This worked fine, so I then tried to run RMA on all arrays in RTerm. Here, I got thousands of error messages after the "computing common mean" step had finished for all arrays. After approx. 20 min of error messages scrolling over my screen, Windows terminated R, so I couldn't copy any output. I made a screenshot, which is attached (although it might not make it onto the BioC list).
Therefore, I tried the stepwise approach in RTerm. To my great surprise, everything now worked fine. There was no error when I started the quantile normalization with the same code as before (except for verbose=TRUE). The median polish afterwards also worked. The output of RTerm is pasted below.
So again, thank you very much for your help.
Kind regards,
Mike
> data.norm = normalize.quantiles(data.bkgd, filename = "quantile", filedir = $
+ tmpdir = "", update = FALSE, exonlevel = exonlevel, verbose = TRUE)
Opening file <X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root> in <READ> mode...
Creating new file <F:/Auswertung/GENEPI_combined/quantile.root>...
Opening file <F:/Auswertung/GENEPI_combined/bkgd_correct.root> in <READ> mode...
Preprocessing data using method <preprocess>...
Normalizing raw data...
normalizing data using method <quantile>...
setting selector mask for typepm <9216>
finished filling <324> arrays.
computing common mean...
finished filling <324> trees.
preprocessing finished.
> save.image("F:/Auswertung/GENEPI_combined/GENEPI_all_stepwise.RData")
> data.mp = summarize.rma(data.norm, filename = "medianpolish", filedir = getw$
+ update = FALSE, option = "transcript", exonlevel = exonlevel, xps.scheme =$
Opening file <X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root> in <READ> mode...
Creating new file <F:/Auswertung/GENEPI_combined/medianpolish.root>...
Opening file <F:/Auswertung/GENEPI_combined/quantile.root> in <READ> mode...
Preprocessing data using method <preprocess>...
Converting raw data to expression levels...
summarizing with <medianpolish>...
setting selector mask for typepm <9216>
setting selector mask for typepm <9216>
calculating expression for <28829> of <33664> units...Finished.
expression statistics:
minimal expression level is <3.11771>
maximal expression level is <20015.1>
preprocessing finished.
Opening file <X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root> in <READ> mode...
Opening file <F:/Auswertung/GENEPI_combined/medianpolish.root> in <READ> mode...
Opening file <F:/Auswertung/GENEPI_combined/medianpolish.root> in <READ> mode...
Exporting data from tree <*> to file <F:/Auswertung/GENEPI_combined/medianpolish.txt>...
Reading entries from <HuGene-1_0-st-v1.ann> ...Finished
<28829> of <28829> records exported.
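
A side note on the log above: the value in "setting selector mask for typepm <9216>" appears to be simply the sum passed as exonlevel in the original script, rep((8192+1024),3). A minimal sketch in plain R (no xps needed) to check the arithmetic; reading the two set bits as separate selection flags is an assumption, not something the log states:

```r
# Value passed as exonlevel in the original call: rep((8192+1024), 3)
exonlevel <- rep(8192 + 1024, 3)
print(exonlevel)

# The log reports the selector mask as the single number 9216.
stopifnot(all(exonlevel == 9216))

# Decompose 9216 into its set bits (0-based positions, least significant first).
# intToBits() returns the 32 bits of an integer as a raw vector.
set_bits <- which(as.integer(intToBits(9216L)) == 1L) - 1L
print(set_bits)      # bit positions that are set
print(2^set_bits)    # the corresponding powers of two
```

The two powers of two printed at the end are the two summands from the script, which is consistent with the mask being built by adding the individual level codes.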
-----Original Message-----
From: cstrato <cstrato at aon.at>
Sent: 27.08.2010 21:05:46
To: Mike Walter <michael_walter at email.de>
Subject: Re: [BioC] Can't normalize 300+ HuGene arrays in xps
>Dear Mike,
>
>In case that your problem turns out to be a memory-related problem, you
>can use rma(...,add.data=FALSE,..), which will prevent filling slot
>"data" with the expression levels. You can then import all normalized
>data or parts thereof using "export.expr()" or "root.expr()", as the
>help files show.
>
>Thus you could first run rma and then import the results in a separate step:
>
>## rma
> > data.rma <- rma(data.xps, "tmpdt_dataRMA", background="antigenomic",
>normalize=T, exonlevel=exonlevel, add.data=FALSE, verbose = TRUE)
>
>## import subset of trees:
>ds <- export.expr(data.rma, treenames=c("name1.mdp","name3.mdp", etc),
>treetype="mdp", varlist="fUnitName:fSymbol:fLevel", outfile="tmp.txt",
>as.dataframe=TRUE)
>
>## use subset of trees
> > sub.rma <- root.expr(scheme.test3, "tmpdt_dataRMA.root", "mdp",
>c("name1.mdp", "name2", etc))
> > str(sub.rma)
>
>Maybe after starting a new R-session, you are able to import all trees
>with "treenames='*'".
>
>Please let me know if this could solve your problem.
>
>Best regards
>Christian
>
>
>On 8/27/10 3:35 PM, Mike Walter wrote:
>> Hi all,
>>
>> I have a set of 324 HuGene 1.0 arrays that I'd like to normalize in one batch on a "normal" Windows computer. I already normalized the arrays successfully with xps in two sets of 180 and 144 samples. When I apply the code below to process all samples together, my R session just crashes.
>>
>> library(xps)
>> memory.limit(size=3000) # I modified my boot.ini to allow more memory. At least I hope it works.
>> exonlevel=rep((8192+1024),3)
>> scheme="Scheme_HuGene10stv1r4_na30_hg19.root"
>> gene.scheme<- root.scheme(paste("X:/affy/QC_Scripts/xps/schemes",scheme,sep="/"))
>> data.xps = root.data(gene.scheme, paste(getwd(),"Genepi_all_cel.root",sep="/"))
>> data.rma<- rma(data.xps, "tmpdt_dataRMA", background="antigenomic", normalize=T,
>> exonlevel=exonlevel, verbose = FALSE)
>>
>>
>> Thus, I tried to do the RMA stepwise. I succeeded with the background correction, but I get an error when trying to do the quantile normalization:
>>
>> data.bkgd = bgcorrect.rma(data.xps, filename = "bkgd_correct",
>> filedir = getwd(), tmpdir = "", update = FALSE,
>> select = "antigenomic", exonlevel = exonlevel, verbose = FALSE)
>>
>> data.norm = normalize.quantiles(data.bkgd, filename = "quantile", filedir = getwd(),
>> tmpdir = "", update = FALSE, exonlevel = exonlevel, verbose = FALSE)
>>
>> OR
>>
>>
>> data.norm = normalize(data.bkgd, "quantile", filedir=getwd(), tmpdir="",
>> method="quantile", select="pmonly", option="transcript:together:none",
>> logbase="0", params=c(0.0), exonlevel=exonlevel)
>>
>>
>> In both cases the output is "Fehler in .local(object, ...) : error in function ‘Normalize’" (German locale; "Fehler in" means "Error in"). I guess it is only a wrong option somewhere. I also tried exonlevel="metacore+affx" with the same result. Can anyone give me a hint as to what might be missing?
>>
>> Thank you very much.
>>
>> Best,
>> Mike
>>
>>> sessionInfo()
>> R version 2.10.1 (2009-12-14)
>> i386-pc-mingw32
>>
>> locale:
>> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
>> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
>> [5] LC_TIME=German_Germany.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] xps_1.6.4
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.10.1
>>