[BioC] quantile robust and RMA in xps

cstrato cstrato at aon.at
Sat May 23 17:42:46 CEST 2009


Dear Mayte,

Although not recommended, this is in principle possible. However, your 
xps version is too old: you need version "xps_1.4.x", where I have 
modified the method "intensity()<-" for this purpose; see the help file 
"?intensity".

See my further comments below.


Mayte Suarez-Farinas wrote:
> Hi everybody.
>
> I am working with xps and I have to admit I still don't get all the
> nuances, but I am trying my best.
> To summarize the data, I want to use rma but with an alteration to
> the normalization step.
> So I need to do the 3 steps: bgcorrect, normalize and summarize. I
> got two problems trying to do so:
>
> 1. In background correction:
>
> the default RMA background is:
> data.bg.rma <- bgcorrect(G1ST_data2,"tmp_bg",method="rma",
>   exonlevel="core+affx", select="none",
>   option="pmonly:epanechnikov",params=c(16384))
> but I got the following error:
>
> g.rma <- bgcorrect(G1ST_data2,"tmp_bg",method="rma",exonlevel="all",  
> select="none", 	option="pmonly:epanechnikov",params=c(16384))
> Error in .local(object, ...) : error in function ‘BgCorrect’
> Opening file </Users/Mayte/Rlibrary/AffyDB/ROOTSchemes/ 
> Scheme_HuGene10stv1r4_na28.root> in <READ> mode...
> Creating new temporary file </Volumes/..../tmp_bg.root>...
> Preprocessing data using method <adjustbgrd>...
>     Background correcting raw data...
>        calculating background for <1_HuGene 1_0 ST_050409.cel>...
> Error: Number of PMs or MMs is zero.
> An error has occured: Need to abort current process.
>   

Please note that the default settings are always for expression arrays, 
so the error tells you that there are no MMs.

> So, I try:
>
> data.bg.rma <- bgcorrect(G1ST_data2,"tmp_bg2",method="rma",
>   exonlevel="core+affx", select="antigenomic",
>   option="pmonly:epanechnikov",params=c(16384))
>
> which works OK but I don't know if it is OK.
>   


This is the correct setting for whole genome and exon arrays. 
select="antigenomic" tells the program to use the antigenomic background 
probes as MMs, e.g. if you use option "mmonly" instead of "pmonly".


> After that I want to use the normalize.quantiles.robust function from
> affy (it is not available in xps),
> so I did:
>
> data.bg.rma<-attachInten(data.bg.rma)
> data.int<-intensity(data.bg.rma)
> detach(package:xps)
> library(affy)
> data.int.norm<-normalize.quantiles.robust(as.matrix(data.int[,-c 
> (1,2)]),n.remove=5,remove.extreme='both')
>   

In version R-2.9.0 which I am using, this function has moved to package 
"preprocessCore" but it seems not to work:

library(preprocessCore)
data.int.norm <- 
normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]), n.remove=1, 
remove.extreme='both')

I get the following error message:
Error in normalize.quantiles.robust(as.matrix(data.int[, -c(1, 2)]), 
n.remove = 1, :
VECTOR_ELT() can only be applied to a 'list', not a 'character'

Thus to simulate your setting I use function "normalize.quantiles" and 
delete one sample by hand:

data.int.norm <- normalize.quantiles(as.matrix(data.int[,-c(1,2)]))
data.int.norm <- data.int.norm[,-4]
colnames(data.int.norm) <- 
c("Breast01","Breast02","Breast03","Prostate02","Prostate03")

Note that (at least for me) the output is a matrix without column names, 
so you need to set the correct column names manually.
(In my example I am using the breast/prostate triplicates from the Affy 
dataset.)
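
As a rough illustration of what happens to the column names, here is a 
minimal base-R sketch of quantile normalization on toy data. It is not the 
preprocessCore implementation (which, among other things, handles ties 
differently); the helper "quantile_normalize" and the toy matrix are made 
up for this example. The only point is that the sample names can be copied 
from the input matrix instead of being typed by hand.

```r
# Toy stand-in for quantile normalization in base R (hypothetical helper,
# NOT the preprocessCore code; ties are broken by order of appearance).
quantile_normalize <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x), nrow(x) > 1, !anyNA(x))
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank within each column
  sorted <- apply(x, 2, sort)                         # each column sorted ascending
  target <- rowMeans(sorted)                          # mean of the k-th smallest values
  out <- apply(ranks, 2, function(r) target[r])       # map ranks back to target values
  dimnames(out) <- NULL                               # mimic an output without names
  out
}

# Toy intensities: 4 probes x 3 samples (made-up numbers and names).
x <- matrix(c(5, 2, 3, 4,
              4, 1, 6, 2,
              3, 4, 6, 8),
            ncol = 3,
            dimnames = list(NULL, c("Breast01", "Breast02", "Prostate01")))

x.norm <- quantile_normalize(x)

# Drop one sample by hand, then copy the names of the remaining columns
# from the input so that names and columns cannot get out of step.
drop <- 3
x.norm <- x.norm[, -drop, drop = FALSE]
colnames(x.norm) <- colnames(x)[-drop]
print(x.norm)
```

Using the same index ("drop") for both the matrix and the name vector is 
what keeps the names aligned with the columns.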


> which shows that the data is normalized. Then I have to update the
> intensities in the xps object data.bg.rma,
> which I did and after
>
> library(xps)
> str(data.int)
> data.int[,-c(1,2)]<-data.int.norm
> intensity(data.bg.rma)<-data.int
> boxplot(data.bg.rma)              #boxplot is OK
>   

The new replacement method "intensity()<-" has an option to create a new 
ROOT file (see ?intensity), thus you need to do:

library(xps)
str(data.int)

data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))

Here you see that I added the (x,y) coordinates, but it is up to you to 
make sure that the order is correct.
I am using cbind() to prevent recycling of the samples, which is what I 
get when using "data.int[,-c(1,2)]".
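
To make the point about row order concrete, here is a small sketch with 
toy data. The layout (X, Y, then one column per sample) follows the data 
frame used above; the numbers are made up, and the check is only an 
illustration, assuming the normalized matrix still has one row per (x,y) 
pair in the original order.

```r
# Toy stand-in for the intensity data frame: (x,y) coordinates plus samples.
data.int <- data.frame(X = c(0, 1, 0, 1),
                       Y = c(0, 0, 1, 1),
                       Breast01   = c(110, 95, 130, 87),
                       Breast02   = c(120, 90, 125, 80),
                       Prostate01 = c(100, 99, 140, 85))

# Pretend these are the normalized intensities with one sample removed
# (here simply a copy of two columns, not a real normalization).
data.int.norm <- as.matrix(data.int[, c("Breast01", "Breast02")])

# cbind() pairs rows purely by position, so at least check the row count.
stopifnot(nrow(data.int.norm) == nrow(data.int))

# Put the coordinates back in front of the normalized samples.
data.new <- as.data.frame(cbind(data.int[, c(1, 2)], data.int.norm))
str(data.new)
```

A matching row count does not prove that the order is right, so any 
reordering done between extracting and re-attaching the intensities still 
has to be tracked by hand.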

Now I can use the replacement method:

intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
str(data.bg.rma)
boxplot(data.bg.rma) #boxplot is OK

Please note that this will take some time since the background-corrected 
intensities will first be saved as CEL-files which are then imported 
into the new ROOT file "tmp_int2_cel.root".


> The problem comes when I summarize the resulting data using median
> polish:
> the resulting data is not normalized:
>
> data.mp.rma <- summarize.rma(data.bg.rma,"tmp_sum_rma",
>   exonlevel="core+affx")
> boxplot(data.mp.rma)    #boxplot is not OK.
>   

Now you can summarize the data using xps, but you need to replace the 
setname first:

setName(data.bg.rma) <- "DataSet"
data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma", 
exonlevel="core+affx")
boxplot(data.mp.rma) #boxplot is now OK.

I hope this helps.
Best regards
Christian


> I don't know if I made a mistake, especially in updating the intensities
> after the normalization step. I would really appreciate any insight on
> this. Below is my session info...
>
>
>  > sessionInfo()
> R version 2.8.1 (2008-12-22)
> i386-apple-darwin8.11.1
>
> locale:
> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
>   [1] grid      splines   tools     stats     graphics  grDevices  
> utils     datasets  methods   base
>
> other attached packages:
>   [1] xps_1.2.10                affy_1.20.2                
> arrayQualityMetrics_1.8.1 marray_1.20.0              
> latticeExtra_0.5-4        vsn_3.8.0
>   [7] beadarray_1.10.0          sma_0.5.15                 
> hwriter_1.0               affycoretools_1.14.1       
> annaffy_1.14.0            KEGG.db_2.2.5
> [13] biomaRt_1.16.0            GOstats_2.8.0              
> Category_2.8.4            RBGL_1.18.0                
> GO.db_2.2.5               RSQLite_0.7-1
> [19] DBI_0.2-4                 graph_1.20.0               
> limma_2.16.5              affyQCReport_1.20.0        
> geneplotter_1.20.0        annotate_1.20.1
> [25] AnnotationDbi_1.5.18      lattice_0.17-17            
> RColorBrewer_1.0-2        affyPLM_1.18.1             
> preprocessCore_1.4.0      xtable_1.5-4
> [31] simpleaffy_2.18.0         gcrma_2.14.1               
> matchprobes_1.14.1        genefilter_1.22.0          
> survival_2.34-1           Biobase_2.2.2
>
> loaded via a namespace (and not attached):
> [1] GSEABase_1.4.0     KernSmooth_2.22-22 RCurl_0.94-1        
> XML_2.1-0          affyio_1.10.1      cluster_1.11.11


