[BioC] Can't normalize 300+ HuGene arrays in xps

cstrato cstrato at aon.at
Mon Aug 30 21:37:44 CEST 2010


Dear Mike,

First, I am glad to hear that the stepwise approach did finally work.
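
For readers of the archive, the stepwise sequence that worked can be sketched roughly as below. The three xps calls and their arguments are taken from the code and console output quoted further down in this thread. The wrapper name rma_stepwise and its out.dir argument are purely illustrative and not part of xps, and because the original summarize.rma() call is truncated in the log, the arguments shown for it are an assumption (xps.scheme and tmpdir are simply left at their defaults).

```r
# Sketch only: wraps the three steps that were run one after another.
# 'rma_stepwise' and 'out.dir' are hypothetical names, not part of xps.
rma_stepwise <- function(data.xps, exonlevel, out.dir = getwd()) {
  stopifnot(requireNamespace("xps", quietly = TRUE))

  # 1. RMA background correction, written to bkgd_correct.root
  data.bkgd <- xps::bgcorrect.rma(data.xps, filename = "bkgd_correct",
                                  filedir = out.dir, tmpdir = "",
                                  update = FALSE, select = "antigenomic",
                                  exonlevel = exonlevel, verbose = TRUE)

  # 2. quantile normalization, written to quantile.root
  data.norm <- xps::normalize.quantiles(data.bkgd, filename = "quantile",
                                        filedir = out.dir, tmpdir = "",
                                        update = FALSE, exonlevel = exonlevel,
                                        verbose = TRUE)

  # 3. median polish summarization, written to medianpolish.root
  #    (argument list assumed; the original call is truncated in the log)
  xps::summarize.rma(data.norm, filename = "medianpolish",
                     filedir = out.dir, update = FALSE,
                     option = "transcript", exonlevel = exonlevel)
}

# Usage (not run), with exonlevel as in the original post:
# exonlevel <- rep(8192 + 1024, 3)
# data.mp   <- rma_stepwise(data.xps, exonlevel)
```

Each step writes its own ROOT file, so an intermediate result survives if a later step fails.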

Thank you also for sending me the screenshot which repeats the following 
message many times:

This error is symptomatic of a Tree created as a memory-resident Tree
Instead of doing:
    TTree *T = new TTree(...);
    TFile *f = new TFile(...);
you should do:
    TFile *f = new TFile(...);
    TTree *T = new TTree(...);

Since I always create the TFile before creating any new TTree, this
means that for some reason the connection to the TFile was lost, so
that the trees are kept in RAM. With only 6 trees this is no problem,
but with 324 trees you get this error message. Sadly, the beginning of
the error messages is lost, so I do not know whether the TFile was
created or not.

Thus, at the moment I have no idea what might be causing this problem;
until now this error has never been reported.

I would really appreciate it if you could try to run rma() with
'filename = "dataRMA"' instead of 'filename = "tmpdt_dataRMA"' and let
me know if the problem remains.
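
A minimal sketch of that test run, reusing the arguments of the rma() call quoted below so that only the filename changes. The helper name rerun_rma is hypothetical and not part of xps; add.data = FALSE is kept only to reduce memory use on the R side.

```r
# Sketch only: 'rerun_rma' is a hypothetical helper, not part of xps.
# It repeats the rma() call from this thread with a different filename.
rerun_rma <- function(data.xps, exonlevel, filename = "dataRMA") {
  stopifnot(requireNamespace("xps", quietly = TRUE))
  xps::rma(data.xps, filename, background = "antigenomic",
           normalize = TRUE, exonlevel = exonlevel,
           add.data = FALSE, verbose = TRUE)
}

# Usage (not run), with exonlevel as in the original post:
# data.rma <- rerun_rma(data.xps, exonlevel = rep(8192 + 1024, 3))
```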

Best regards
Christian


On 8/30/10 1:23 PM, Mike Walter wrote:
> Dear Christian,
>
> Thanks for your help. To answer your questions first: I normally use RGui and my disk space was ~100 GB. I also tried the add.data=FALSE option, without success.
>
> So I did RMA normalization with 6 arrays in RTerm as you proposed. This worked fine. So I then tried to run RMA on all arrays in RTerm. Here, I got thousands of error messages after the "computing common mean" step was finished for all arrays. After approx. 20 min of error messages scrolling over my screen, Windows terminated R, so I couldn't copy any output. I made a screenshot, which is attached (although it might not make it onto the BioC list).
>
> Therefore, I tried the stepwise approach in RTerm. To my great surprise, everything now worked fine. There was no error when I started the quantile normalization with the same code as before (except for verbose=TRUE). The median polish afterwards also worked. The output of RTerm is pasted below.
>
> So again, thank you very much for your help.
>
> Kind regards,
>
> Mike
>
>
>> data.norm = normalize.quantiles(data.bkgd, filename = "quantile", filedir = $
> + tmpdir = "", update = FALSE, exonlevel = exonlevel, verbose = TRUE)
> Opening file<X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root>  in<READ>  mode...
> Creating new file<F:/Auswertung/GENEPI_combined/quantile.root>...
> Opening file<F:/Auswertung/GENEPI_combined/bkgd_correct.root>  in<READ>  mode...
>
> Preprocessing data using method<preprocess>...
>   Normalizing raw data...
>   normalizing data using method<quantile>...
>   setting selector mask for typepm<9216>
>   finished filling<324>  arrays.
>   computing common mean...
>   finished filling<324>  trees.
>   preprocessing finished.
>> save.image("F:/Auswertung/GENEPI_combined/GENEPI_all_stepwise.RData")
>> data.mp = summarize.rma(data.norm, filename = "medianpolish", filedir = getw$
> +   update = FALSE, option = "transcript", exonlevel = exonlevel, xps.scheme =$
> Opening file<X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root>  in<READ>  mode...
> Creating new file<F:/Auswertung/GENEPI_combined/medianpolish.root>...
> Opening file<F:/Auswertung/GENEPI_combined/quantile.root>  in<READ>  mode...
> Preprocessing data using method<preprocess>...
>   Converting raw data to expression levels...
>   summarizing with<medianpolish>...
>   setting selector mask for typepm<9216>
>   setting selector mask for typepm<9216>
>   calculating expression for<28829>  of<33664>  units...Finished.
>   expression statistics:
>   minimal expression level is<3.11771>
>   maximal expression level is<20015.1>
>   preprocessing finished.
> Opening file<X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root>  in<READ>  mode...
> Opening file<F:/Auswertung/GENEPI_combined/medianpolish.root>  in<READ>  mode...
>
> Opening file<F:/Auswertung/GENEPI_combined/medianpolish.root>  in<READ>  mode...
>
> Exporting data from tree<*>  to file<F:/Auswertung/GENEPI_combined/medianpolish.txt>...
> Reading entries from<HuGene-1_0-st-v1.ann>  ...Finished
> <28829>  of<28829>  records exported.
>
>
>
> -----Original Message-----
> From: cstrato<cstrato at aon.at>
> Sent: 27.08.2010 21:05:46
> To: Mike Walter<michael_walter at email.de>
> Subject: Re: [BioC] Can't normalize 300+ HuGene arrays in xps
>
>> Dear Mike,
>>
>> In case that your problem turns out to be a memory-related problem, you
>> can use rma(...,add.data=FALSE,..), which will prevent filling slot
>> "data" with the expression levels. You can then import all normalized
>> data or parts thereof using "export.expr()" or "root.expr()", as the
>> help files show.
>>
>> Thus you could first run rma and then import the results in a separate step:
>>
>> ## rma
>>> data.rma<- rma(data.xps, "tmpdt_dataRMA", background="antigenomic",
>> normalize=T, exonlevel=exonlevel,  add.data=FALSE, verbose = TRUE)
>>
>> ## import subset of trees:
>> ds<- export.expr(data.rma, treenames=c("name1.mdp","name3.mdp", etc),
>> treetype="mdp", varlist="fUnitName:fSymbol:fLevel", outfile="tmp.txt",
>> as.dataframe=TRUE)
>>
>> ## use subset of trees
>>> sub.rma<- root.expr(scheme.test3, "tmpdt_dataRMA.root", "mdp",
>> c("name1.mdp", "name2", etc))
>>> str(sub.rma)
>>
>> Maybe after starting a new R-session, you are able to import all trees
>> with "treenames='*'".
>>
>> Please let me know if this could solve your problem.
>>
>> Best regards
>> Christian
>>
>>
>> On 8/27/10 3:35 PM, Mike Walter wrote:
>>> Hi all,
>>>
>>> I have a set of 324 HuGene 1.0 arrays I'd like to normalize all in one batch on a "normal" Windows computer. I already normalized the arrays in two sets of 180 and 144 samples successfully with xps. When I apply the code below to normalize all samples together, my R session just crashes.
>>>
>>> library(xps)
>>> memory.limit(size=3000) # I modified my boot.ini to allow more memory. At least I hope it works.
>>> exonlevel=rep((8192+1024),3)
>>> scheme="Scheme_HuGene10stv1r4_na30_hg19.root"
>>> gene.scheme<- root.scheme(paste("X:/affy/QC_Scripts/xps/schemes",scheme,sep="/"))
>>> data.xps = root.data(gene.scheme, paste(getwd(),"Genepi_all_cel.root",sep="/"))
>>> data.rma<- rma(data.xps, "tmpdt_dataRMA", background="antigenomic", normalize=T,
>>>                       exonlevel=exonlevel, verbose = FALSE)
>>>
>>>
>>> Thus, I tried to do the RMA stepwise. I succeeded with the background correction, but got an error when trying to do the quantile normalization:
>>>
>>> data.bkgd = bgcorrect.rma(data.xps, filename = "bkgd_correct",
>>>                       filedir = getwd(), tmpdir = "", update = FALSE,
>>>                       select = "antigenomic", exonlevel = exonlevel, verbose = FALSE)
>>>
>>> data.norm = normalize.quantiles(data.bkgd, filename = "quantile", filedir = getwd(),
>>>                        tmpdir = "", update = FALSE, exonlevel = exonlevel, verbose = FALSE)
>>>
>>> OR
>>>
>>>
>>> data.norm = normalize(data.bkgd, "quantile", filedir=getwd(), tmpdir="",
>>>                       method="quantile", select="pmonly", option="transcript:together:none",
>>>                       logbase="0", params=c(0.0), exonlevel=exonlevel)
>>>
>>>
>>> In both cases the output is "Fehler [Error] in .local(object, ...) : error in function ‘Normalize’". I guess it is only a wrong option somewhere. I also tried exonlevel="metacore+affx" with the same result. Can anyone give me a hint as to what might be missing?
>>>
>>> Thank you very much.
>>>
>>> Best,
>>> Mike
>>>
>>>> sessionInfo()
>>> R version 2.10.1 (2009-12-14)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
>>> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
>>> [5] LC_TIME=German_Germany.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] xps_1.6.4
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.10.1
>>>
>>>
>>>
>>>
>> >


