[BioC] quantile robust and RMA in xps

Mayte Suarez-Farinas farinam at mail.rockefeller.edu
Mon May 25 18:06:42 CEST 2009



Hi Christian,
Thanks for your answer. Regarding my first question, I am sorry, but I am
still confused about what the correct setting is. I am working with
HuGene 1_0 ST arrays, measuring expression, and I thought I had to use the
common (default) RMA with PMs only, but that does not work. The option
that works, with select="antigenomic", uses MMs. Is this option then
right for my case?
Best,
Mayte





On May 23, 2009, at 11:42 AM, cstrato wrote:

> Dear Mayte,
>
> Although not recommended, this is in principle possible, however  
> your xps version is too old, you need version "xps_1.4.x", where I  
> have modified method "intensity()<-" for these purposes, see the  
> help file "?intensity".
>
> See my further comments below.
>
>
> Mayte Suarez-Farinas wrote:
>> Hi everybody.
>>
>> I am working with xps and I have to admit I still don't get all the
>> nuances, but I am trying my best.
>> To summarize the data, I want to use RMA but with an alteration to the
>> normalization step,
>> so I need to do the three steps: bgcorrect, normalize and summarize.
>> I ran into two problems trying to do so:
>>
>> 1. In background correction:
>>
>> the default RMA background is:
>> data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg", method="rma",
>>     exonlevel="core+affx", select="none",
>>     option="pmonly:epanechnikov", params=c(16384))
>> but I got the following error:
>>
>> g.rma <- bgcorrect(G1ST_data2, "tmp_bg", method="rma",
>>     exonlevel="all", select="none",
>>     option="pmonly:epanechnikov", params=c(16384))
>> Error in .local(object, ...) : error in function ‘BgCorrect’
>> Opening file </Users/Mayte/Rlibrary/AffyDB/ROOTSchemes/Scheme_HuGene10stv1r4_na28.root> in <READ> mode...
>> Creating new temporary file </Volumes/..../tmp_bg.root>...
>> Preprocessing data using method <adjustbgrd>...
>>     Background correcting raw data...
>>        calculating background for <1_HuGene 1_0 ST_050409.cel>...
>> Error: Number of PMs or MMs is zero.
>> An error has occured: Need to abort current process.
>>
>
> Please note that the default settings are always for expression  
> arrays, so the error tells you that there are no MMs.
>
>> So I tried:
>>
>> data.bg.rma <- bgcorrect(G1ST_data2, "tmp_bg2", method="rma",
>>     exonlevel="core+affx", select="antigenomic",
>>     option="pmonly:epanechnikov", params=c(16384))
>>
>> which runs without error, but I don't know whether it is the right setting.
>>
>
>
> This is the correct setting for whole genome and exon arrays.  
> select="antigenomic" tells the program to use the antigenomic  
> background probes as MMs, e.g. if you use option "mmonly" instead  
> of "pmonly".
>
>
>> After that I want to use the normalize.quantiles.robust function from
>> affy (it is not available in xps),
>> so I did:
>>
>> data.bg.rma <- attachInten(data.bg.rma)
>> data.int <- intensity(data.bg.rma)
>> detach(package:xps)
>> library(affy)
>> data.int.norm <- normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]),
>>     n.remove=5, remove.extreme='both')
>>
>
> In version R-2.9.0, which I am using, this function has moved to
> package "preprocessCore", but it does not seem to work:
>
> library(preprocessCore)
> data.int.norm <- normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]),
>     n.remove=1, remove.extreme='both')
>
> I get the following error message:
> Error in normalize.quantiles.robust(as.matrix(data.int[, -c(1, 2)]), n.remove = 1, :
> VECTOR_ELT() can only be applied to a 'list', not a 'character'
>
> Thus to simulate your setting I use function "normalize.quantiles"  
> and delete one sample by hand:
>
> data.int.norm <- normalize.quantiles(as.matrix(data.int[,-c(1,2)]))
> data.int.norm <- data.int.norm[,-4]
> colnames(data.int.norm) <- c("Breast01", "Breast02", "Breast03",
>     "Prostate02", "Prostate03")
>
> Note that (at least for me) the output is a matrix w/o column  
> names, thus you need to set the correct column names manually.
> (In my example I am using the breast/prostate triplicates from the  
> Affy dataset.)
>
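The idea behind plain quantile normalization can be sketched in base R. This is only an illustration on made-up numbers, not the preprocessCore implementation: the function name quantile_normalize and the toy matrix are hypothetical, and ties are broken by order of appearance, which may differ from what normalize.quantiles does. The sketch keeps the column names, which is the detail that has to be handled by hand in the reply above.

```r
# Minimal quantile normalization in base R (illustrative only;
# not the code used by preprocessCore).
quantile_normalize <- function(m) {
  stopifnot(is.matrix(m), is.numeric(m))
  # Rank every column; ties.method = "first" gives each row a distinct rank.
  ranks <- apply(m, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted intensities across samples.
  reference <- rowMeans(apply(m, 2, sort))
  # Replace each value by the reference value that has the same rank.
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(m)  # keep the sample names
  out
}

# Toy matrix standing in for the intensity columns (hypothetical values).
toy <- cbind(SampleA = c(5, 2, 3, 4),
             SampleB = c(4, 1, 4, 2),
             SampleC = c(3, 4, 6, 8))
toy_norm <- quantile_normalize(toy)
```

After normalization every column holds the same set of values, only in a different order, which is why the boxplots of normalized samples look identical.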
>
>> which shows that the data is normalized. Then I have to update the
>> intensities in the xps object data.bg.rma,
>> which I did as follows:
>>
>> library(xps)
>> str(data.int)
>> data.int[,-c(1,2)]<-data.int.norm
>> intensity(data.bg.rma)<-data.int
>> boxplot(data.bg.rma)              #boxplot is OK
>>
>
> The new replacement method "intensity()<-" has an option to create
> a new ROOT file (see ?intensity), so you need to do:
>
> library(xps)
> str(data.int)
>
> data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
>
> Here you see that I added the (x,y) coordinates, but it is up to  
> you to make sure that the order is correct.
> I am using cbind() to prevent cycling of the samples, which is what  
> I get when using "data.int[,-c(1,2)]".
>
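The cbind() step can be tried out without xps on a small stand-in data frame. Everything here is hypothetical (the column names X, Y, s1..s3 and the values); the point is only to show the coordinates being re-attached to a normalized matrix that has lost a sample and its column names, followed by a check that rows and names still line up.

```r
# Hypothetical stand-in for the data frame returned by intensity():
# two coordinate columns followed by one column per sample.
data_int <- data.frame(X = c(0, 1, 0, 1), Y = c(0, 0, 1, 1),
                       s1 = c(10, 20, 30, 40),
                       s2 = c(11, 21, 31, 41),
                       s3 = c(12, 22, 32, 42))

# Pretend the normalization step dropped sample s2 and the column names.
norm_mat <- unname(as.matrix(data_int[, c("s1", "s3")])) * 2
colnames(norm_mat) <- c("s1", "s3")  # restore the names by hand

# Rebuild the frame from the coordinates plus the remaining samples.
rebuilt <- cbind(data_int[, c(1, 2)], as.data.frame(norm_mat))

# Sanity checks before passing the frame to a replacement method.
stopifnot(nrow(rebuilt) == nrow(data_int),
          identical(names(rebuilt), c("X", "Y", "s1", "s3")))
```

Building a fresh data frame avoids assigning a matrix with fewer columns into the old one, where the column count no longer matches.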
> Now I can use the replacement method:
>
> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
> str(data.bg.rma)
> boxplot(data.bg.rma) #boxplot is OK
>
> Please note that this will take some time, since the
> background-corrected intensities will first be saved as CEL-files, which are
> then imported into the new ROOT file "tmp_int2_cel.root".
>
>
>> The problem comes when I summarize the resulting data using median
>> polish; the summarized data is not normalized:
>>
>> data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma",
>>     exonlevel="core+affx")
>> boxplot(data.mp.rma)    #boxplot is not OK.
>>
>
> Now you can summarize the data using xps, but you need to replace  
> the setname first:
>
> setName(data.bg.rma) <- "DataSet"
> data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma",  
> exonlevel="core+affx")
> boxplot(data.mp.rma) #boxplot is now OK.
>
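For readers unfamiliar with the summarization step: RMA-style summarization is usually described as a median polish of the log2 probe intensities within each probe set. The sketch below uses medpolish() from the base stats package on a made-up probe set; the values and names are hypothetical, and how xps implements summarize.rma internally is not shown in this thread.

```r
# Toy probe set: rows are probes, columns are samples (hypothetical values).
probes <- matrix(c(120, 150,  900,
                   200, 260, 1500,
                    80,  95,  640,
                   310, 400, 2300),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("probe", 1:4), c("s1", "s2", "s3")))

# Fit the additive model
#   log2(intensity) = overall + probe effect + sample effect + residual
fit <- medpolish(log2(probes), trace.iter = FALSE)

# One expression value per sample: overall level plus the sample effect.
expr <- fit$overall + fit$col
```

Because the summary is computed from whatever intensities are stored in the object, the normalized values have to be written back correctly before this step, which is what the replacement method and setName() calls above take care of.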
> I hope this helps.
> Best regards
> Christian
>
>
>> I don't know whether I made a mistake, especially in updating the
>> intensities after the normalization step. I would really appreciate
>> any insight on this. Below is my session info...
>>
>>
>>  > sessionInfo()
>> R version 2.8.1 (2008-12-22)
>> i386-apple-darwin8.11.1
>>
>> locale:
>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>>  [1] grid      splines   tools     stats     graphics
>>  [6] grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] xps_1.2.10                affy_1.20.2               arrayQualityMetrics_1.8.1
>>  [4] marray_1.20.0             latticeExtra_0.5-4        vsn_3.8.0
>>  [7] beadarray_1.10.0          sma_0.5.15                hwriter_1.0
>> [10] affycoretools_1.14.1      annaffy_1.14.0            KEGG.db_2.2.5
>> [13] biomaRt_1.16.0            GOstats_2.8.0             Category_2.8.4
>> [16] RBGL_1.18.0               GO.db_2.2.5               RSQLite_0.7-1
>> [19] DBI_0.2-4                 graph_1.20.0              limma_2.16.5
>> [22] affyQCReport_1.20.0       geneplotter_1.20.0        annotate_1.20.1
>> [25] AnnotationDbi_1.5.18      lattice_0.17-17           RColorBrewer_1.0-2
>> [28] affyPLM_1.18.1            preprocessCore_1.4.0      xtable_1.5-4
>> [31] simpleaffy_2.18.0         gcrma_2.14.1              matchprobes_1.14.1
>> [34] genefilter_1.22.0         survival_2.34-1           Biobase_2.2.2
>>
>> loaded via a namespace (and not attached):
>> [1] GSEABase_1.4.0            KernSmooth_2.22-22        RCurl_0.94-1
>> [4] XML_2.1-0                 affyio_1.10.1             cluster_1.11.11
>>
>>
>>