[BioC] quantile robust and RMA in xps
cstrato
cstrato at aon.at
Thu May 28 23:49:38 CEST 2009
Dear Mayte,
Thank you for sending me your code. I have just run your code using the
Affymetrix breast, heart, prostate HuGene data, and everything is ok,
including the final boxplot. However, the results are slightly different
when I replace "normalize.quantiles()" from package preprocessCore with
"normalize.quantiles()" from xps. Furthermore I have found two potential
problems in your code:
1. To simulate the removal of chips from the output of
"normalize.quantiles.robust()" I deleted one column, so I also
had to delete the corresponding column name. Since you have set
"n.remove=5", up to 5 columns may be removed, and you would need to
remove their column names manually.
2. Since you store your original CEL-files in your working directory,
the replacement function "intensity<-" replaced your original CEL-files
with the background-corrected CEL-files of the same name. There are two
ways to prevent this. The best way is to store the original CEL-files in
another directory, e.g. in "raw". The second possibility is to use the
parameter "celnames" of function "import.data()" to rename the imported
CEL-files.
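Regarding point 1, here is a minimal base-R sketch of how the column names could be kept in sync with the removed chips. The toy data frame and the index vector "removed" are hypothetical stand-ins, and no real normalization is done here:

```r
# Toy stand-in for the intensity data frame: X, Y plus one column per chip
data.int <- data.frame(X = 0:3, Y = c(0L, 0L, 1L, 1L),
                       A.cel = c(10, 20, 30, 40),
                       B.cel = c(11, 21, 31, 41),
                       C.cel = c(12, 22, 32, 42))

# Chip names, taken before the matrix loses its column names
chip.names <- colnames(data.int)[-c(1, 2)]

# Stand-in for the normalized matrix (assumed to come back without names)
data.int.norm <- unname(as.matrix(data.int[, -c(1, 2)]))

# Hypothetical: positions of the chips that were dropped
removed <- c(2)
keep <- setdiff(seq_along(chip.names), removed)

# Drop the same positions from the matrix and from the names
kept.norm <- data.int.norm[, keep, drop = FALSE]
aaa <- as.data.frame(cbind(data.int[, c(1, 2)], kept.norm))
colnames(aaa) <- c(colnames(data.int)[1:2], chip.names[keep])
```

Using setdiff() on the positions, instead of a negative index, also works when no chip is removed, since an empty negative index would select no columns at all.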
A third problem may be that function "normalize.quantiles.robust()"
re-orders the matrix, so that the (x,y)-coordinates are no longer
correct. Although this should not be the case, I cannot exclude this
possibility.
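One way to test for such a re-ordering could be the following sketch. It relies on the fact that quantile normalization keeps the rank of each probe within a chip, so the ordering of every column should be the same before and after. The function "toy_qnorm()" is only a simple base-R stand-in for illustration, not the preprocessCore implementation, and "rows.in.order()" is a hypothetical helper:

```r
# Toy data: 10 probes x 4 chips
set.seed(1)
before <- matrix(rexp(40), nrow = 10)

# Simple quantile normalization in base R (illustration only)
toy_qnorm <- function(m) {
  ref <- rowMeans(apply(m, 2, sort))   # reference distribution
  apply(m, 2, function(col) ref[rank(col, ties.method = "first")])
}
after <- toy_qnorm(before)

# TRUE if every column has the same probe ordering before and after
rows.in.order <- function(before, after) {
  all(vapply(seq_len(ncol(before)), function(j) {
    identical(order(before[, j]), order(after[, j]))
  }, logical(1)))
}

ok       <- rows.in.order(before, after)                   # rows untouched
reversed <- rows.in.order(before, after[nrow(after):1, ])  # rows reversed
```

With the real data the same comparison could be applied to the chip columns of "data.int" and the matching columns of the normalized matrix, restricted to the chips that were not removed.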
Here is the complete code that I used for testing.
- - - - - - - - - - - - - - - - - - - - - - - - -
### new R session: load library xps
library(xps)
scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"
scheme.hugene10stv1r4 <- root.scheme(paste(scmdir,
"Scheme_HuGene10stv1r4_na28.root",sep = "/"))
celfiles <- c("Breast_01.CEL", "Breast_02.CEL", "Breast_03.CEL",
"Heart_01.CEL", "Heart_02.CEL", "Heart_03.CEL", "Prostate_01.CEL",
"Prostate_02.CEL", "Prostate_03.CEL")
G1ST_data2<-import.data(scheme.hugene10stv1r4,
"Pamela_NOMID_dataxps_162021", celdir=getwd(), celfiles=celfiles,
verbose=TRUE)
## RMA background
data.bg.rma <-
bgcorrect(G1ST_data2,"tmp_bg",method="rma",exonlevel="core+affx",
select="antigenomic", option="pmonly:epanechnikov",params=c(16384))
# get intensities
data.bg.rma<-attachInten(data.bg.rma)
data.int<-intensity(data.bg.rma)
# normalize with preprocessCore functions
detach(package:xps)
library(preprocessCore)
data.int.norm<-normalize.quantiles.robust(as.matrix(data.int[,-c(1,2)]),n.remove=2,remove.extreme='both')
# manually remove one chip
data.int.norm <- data.int.norm[,-4]
# replace intensity slot
library(xps)
aaa<-as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
# Problem: need to remove colname of chip which was removed
colnames(aaa)<-colnames(data.int)[-6]
intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- aaa
# Problem: does overwrite original CEL-files
boxplot(data.bg.rma) #boxplot is OK
## summarize medianpolish
setName(data.bg.rma) <- "DataSet"
data.mp.rma <-
summarize.rma(data.bg.rma,"tmp_sum_rma",exonlevel="core+affx")
boxplot(data.mp.rma)
- - - - - - - - - - - - - - - - - - - - - - - - -
Best regards
Christian
Mayte Suarez-Farinas wrote:
> Dear Christian,
> I am sorry I need to bother you again!
> Everything worked fine with the background correction, the quantile
> normalization and the substitution
> using function "intensity()<-". When I do the boxplot after this, the
> data is normalized. But after I use summarize.rma, the data is not
> normalized anymore.
>
> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
> boxplot(data.bg.rma) ## Boxplot is perfect!
> setName(data.bg.rma) <- "DataSet"
> data.mp.rma <-
> summarize.rma(data.bg.rma,"tmp_sum_rma",exonlevel="core+affx")
> boxplot(data.mp.rma)
> #Boxplot is NOT ok
>
> any hint?
>
> Best
>
> Mayte
>
>
>>
>> The new replacement method "intensity()<-" has an option to create a
>> new ROOT file (see ?intensity), thus you need to do:
>>
>> library(xps)
>> str(data.int)
>>
>> data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
>>
>> Here you see that I added the (x,y) coordinates, but it is up to you
>> to make sure that the order is correct.
>> I am using cbind() to prevent cycling of the samples, which is what I
>> get when using "data.int[,-c(1,2)]".
>>
>> Now I can use the replacement method:
>>
>> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
>> str(data.bg.rma)
>> boxplot(data.bg.rma) #boxplot is OK
>>
>> Please note that this will take some time since the
>> background-corrected intensities will first be saved as CEL-files
>> which are then imported into the new ROOT file "tmp_int2_cel.root".
>>
>> Now you can summarize the data using xps, but you need to replace the
>> setname first:
>>
>> setName(data.bg.rma) <- "DataSet"
>> data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma",
>> exonlevel="core+affx")
>> boxplot(data.mp.rma) #boxplot is now OK.
>>
>> I hope this helps.
>> Best regards
>> Christian
>>
>>