[BioC] quantile robust and RMA in xps

cstrato cstrato at aon.at
Thu May 28 23:49:38 CEST 2009


Dear Mayte,

Thank you for sending me your code. I have just run it on the 
Affymetrix breast, heart, prostate HuGene data, and everything is OK, 
including the final boxplot. However, the results are slightly different 
when I replace "normalize.quantiles()" from package preprocessCore with 
"normalize.quantiles()" from xps. Furthermore, I have found two potential 
problems in your code:

1. To simulate the removal of chips by 
"normalize.quantiles.robust()" I have deleted one column. Thus I also 
needed to delete the corresponding column name. Since you have set 
"n.remove=5", up to 5 columns may be removed, so you would need to 
remove the matching column names manually.

2. Since you store your original CEL-files in your working directory, 
the replacement function "intensity<-" replaced your original CEL-files 
with the background-corrected CEL-files of the same name. There are two 
ways to prevent this. The best way is to store the original CEL-files in 
another directory, e.g. in "raw". The second possibility is to use 
parameter "celnames" of function "import.data()" to rename the imported 
CEL-files.
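
A minimal sketch of the first option, using only base R. The directory 
"cel_demo" and the placeholder files are made up for illustration; with 
real data you would copy your own CEL-files instead:

```r
# Sketch only: 'workdir' and the placeholder files are hypothetical.
workdir <- file.path(tempdir(), "cel_demo")
unlink(workdir, recursive = TRUE)            # start from a clean directory
dir.create(workdir)
celfiles <- c("Breast_01.CEL", "Heart_01.CEL")
# create small text files standing in for real CEL-files
for (f in celfiles) writeLines("placeholder", file.path(workdir, f))

# copy the originals into a separate "raw" directory
rawdir <- file.path(workdir, "raw")
dir.create(rawdir)
copied <- file.copy(file.path(workdir, celfiles), rawdir, overwrite = FALSE)

# stop if any copy failed (e.g. a file of that name already exists in "raw")
stopifnot(all(copied))
list.files(rawdir)
```

The import would then presumably point "celdir" at the "raw" directory 
instead of getwd(), so that files written to the working directory can 
no longer collide with the originals.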

A third problem may be that function "normalize.quantiles.robust()" 
re-orders the rows of the matrix, so that the (x,y)-coordinates are no 
longer correct. Although this should not be the case, I cannot exclude 
this possibility.
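
One way to test for such a re-ordering is to use the fact that quantile 
normalization should keep the ranking of the probes within each chip. 
The sketch below uses a toy normalization written only for this example 
(it is not the preprocessCore code) and random data; "toy.quantile.norm" 
and "same.order" are hypothetical helper names:

```r
# Toy stand-in for quantile normalization (NOT the preprocessCore code):
# each column is mapped onto the row means of the sorted columns.
toy.quantile.norm <- function(m) {
  srt <- apply(m, 2, sort)                   # sort every column
  ref <- rowMeans(srt)                       # common reference distribution
  apply(m, 2, function(v) ref[rank(v, ties.method = "first")])
}

set.seed(1)
before <- matrix(rexp(40), nrow = 10, ncol = 4)   # made-up intensities
after  <- toy.quantile.norm(before)

# TRUE if every column is ordered the same way before and after
same.order <- function(a, b) {
  all(sapply(seq_len(ncol(a)), function(j)
    identical(order(a[, j]), order(b[, j]))))
}

same.order(before, after)                    # rows still aligned
same.order(before, after[nrow(after):1, ])   # rows deliberately reversed
```

With real intensities there are tied values, so a strict comparison like 
this may raise false alarms; a rank correlation per column is probably 
the more forgiving check.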

Here is the complete code that I used for testing.

- - - - - - - - - - - - - - - - - - - - - - - - -
 ### new R session: load library xps
library(xps)
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"
scheme.hugene10stv1r4 <- root.scheme(paste(scmdir, 
"Scheme_HuGene10stv1r4_na28.root",sep = "/"))

celfiles <- c("Breast_01.CEL", "Breast_02.CEL", "Breast_03.CEL", 
"Heart_01.CEL", "Heart_02.CEL", "Heart_03.CEL", "Prostate_01.CEL", 
"Prostate_02.CEL", "Prostate_03.CEL")
G1ST_data2<-import.data(scheme.hugene10stv1r4, 
"Pamela_NOMID_dataxps_162021", celdir=getwd(), celfiles=celfiles, 
verbose=TRUE)

## RMA background
data.bg.rma <- 
bgcorrect(G1ST_data2,"tmp_bg",method="rma",exonlevel="core+affx", 
select="antigenomic", option="pmonly:epanechnikov",params=c(16384))

# get intensities
data.bg.rma<-attachInten(data.bg.rma)
data.int<-intensity(data.bg.rma)

# normalize with affy functions
detach(package:xps)
library(preprocessCore)
data.int.norm<-normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]),n.remove=2,remove.extreme='both')
# manually remove one chip
data.int.norm <- data.int.norm[,-4]

# replace intensity slot
library(xps)
aaa<-as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
# Problem: need to remove colname of chip which was removed
colnames(aaa)<-colnames(data.int)[-6]
intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- aaa
# Problem: does overwrite original CEL-files
boxplot(data.bg.rma) #boxplot is OK

## summarize medianpolish
setName(data.bg.rma) <- "DataSet"
data.mp.rma <- 
summarize.rma(data.bg.rma,"tmp_sum_rma",exonlevel="core+affx")
boxplot(data.mp.rma)
- - - - - - - - - - - - - - - - - - - - - - - - -
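
The hard-coded indices "-4" and "-6" in the code above only fit this one 
example. A more general way to keep the column names in step with the 
removed chips could look like the following sketch; the data frame, its 
column names and the vector "removed" are made up for illustration:

```r
# Toy data frame laid out like data.int above: X, Y, then one column per
# chip (values and column names are invented).
data.int <- data.frame(X = 0:3, Y = c(0, 0, 1, 1),
                       Breast_01.cel   = c(10, 12, 9, 11),
                       Heart_01.cel    = c(20, 25, 19, 21),
                       Prostate_01.cel = c(11, 13, 10, 12))

chips <- colnames(data.int)[-c(1, 2)]        # sample columns only
norm  <- as.matrix(data.int[, chips])        # stand-in for the normalized matrix
colnames(norm) <- NULL                       # assume the names got lost

# hypothetical: positions (within 'norm') of the chips that were dropped
removed <- c(2)
keep    <- setdiff(seq_along(chips), removed)

norm <- norm[, keep, drop = FALSE]           # drop the chips ...
colnames(norm) <- chips[keep]                # ... and the names by the same index

aaa <- cbind(data.int[, c("X", "Y")], as.data.frame(norm))
colnames(aaa)
```

Because data and names are subset with the same index vector, they 
cannot get out of step, whatever number of chips is removed.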

Best regards
Christian


Mayte Suarez-Farinas wrote:
> Dear Christian,
> I am sorry I need to bother you again!
> Everything worked fine with the background correction, the quantile 
> normalization and the substitution 
> using function "intensity()<-". When I do the boxplot after this, the 
> data are normalized. But when I then use summarize.rma,
> the data are not normalized anymore.
>
> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
> boxplot(data.bg.rma)                          ## Boxplot is perfect!
> setName(data.bg.rma) <- "DataSet"
> data.mp.rma <- 
> summarize.rma(data.bg.rma,"tmp_sum_rma",exonlevel="core+affx")
> boxplot(data.mp.rma)                          ## Boxplot is NOT ok
>
> any hint?
>
> Best
>
> Mayte
>
>
>>
>> The new replacement method "intensity()<-" has an option to create a 
>> new ROOT file (see ?intensity), thus you need to do:
>>
>> library(xps)
>> str(data.int)
>>
>> data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
>>
>> Here you see that I added the (x,y) coordinates, but it is up to you 
>> to make sure that the order is correct.
>> I am using cbind() to prevent cycling of the samples, which is what I 
>> get when using "data.int[,-c(1,2)]".
>>
>> Now I can use the replacement method:
>>
>> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
>> str(data.bg.rma)
>> boxplot(data.bg.rma) #boxplot is OK
>>
>> Please note that this will take some time since the 
>> background-corrected intensities will first be saved as CEL-files 
>> which are then imported into the new ROOT file "tmp_int2_cel.root".
>>
>> Now you can summarize the data using xps, but you need to replace the 
>> setname first:
>>
>> setName(data.bg.rma) <- "DataSet"
>> data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma", 
>> exonlevel="core+affx")
>> boxplot(data.mp.rma) #boxplot is now OK.
>>
>> I hope this helps.
>> Best regards
>> Christian
>>
>>


