[BioC] quantile robust and RMA in xps
Mayte Suarez Farinas
farinam at mail.rockefeller.edu
Fri May 29 02:21:43 CEST 2009
>Dear Mayte,
>
>Thank you for sending me your code. I have just run your code using the
>Affymetrix breast, heart, prostate HuGene data, and everything is ok,
>including the final boxplot. However, the results are slightly different
>when I replace "normalize.quantiles()" from package preProcessCore with
>"normalize.quantiles()" from xps. Furthermore I have found two potential
>problems in your code:
>
>1, To simulate the removal of chips as output from
>"normalize.quantiles.robust()" I have deleted one column. Thus I also
>needed to delete the corresponding column name. Since you have set
>"n.remove=5", up to 5 columns may be removed, so you would need to
>remove these column names manually.
I do not want to remove those chips from the output. the remove option on
normalize.quantiles.robust do not remove the columns per se, it only uses the
remaining chips to calculate the reference distribution to which all the chips
(including the 'removed') are normalized. This step produces a matrix with the same
number of columns as the original. I don't want to removed those chips, only not to
include them in the estimation of the reference distribution (if I include them the
reference distribution has utterly low range and I want to avoid to delete those
chips, their qc is not that bad)
>
>2, Since you store your original CEL-files in your working directory,
>replacement function "intensity<-" did replace your original CEL-files
>with the background corrected CEL-files of the same name. There are two
>ways to prevent this. The best way is to store the original CEL-files in
>another directory, e.g. in "raw". The second possibility is to use
>parameter "celnames" of function "import.data()" to rename the imported
>CEL-files.
Do you mean that my CEL files were changed?. they are not more the raw data ?
Oh Oh! I was not aware of that. Oh my! I hope that the facility kept a copy of my
CEL files otherwise I am in deep trouble!
>
>A third problem may be that function "normalize.quantiles.robust()"
>re-orders the matrix, so that the (x,y)-coordinates are no longer
>correct. Although this should not be the case I cannot exclude this
>possibility.
Maybe Bolstad can answer this...
>
>Here is the complete code that I used for testing.
>
>- - - - - - - - - - - - - - - - - - - - - - - - -
> ### new R session: load library xps
>library(xps)
>scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"
>scheme.hugene10stv1r4 <- root.scheme(paste(scmdir,
>"Scheme_HuGene10stv1r4_na28.root",sep = "/"))
>
>celfiles <- c("Breast_01.CEL", "Breast_02.CEL", "Breast_03.CEL",
>"Heart_01.CEL", "Heart_02.CEL", "Heart_03.CEL", "Prostate_01.CEL",
>"Prostate_02.CEL", "Prostate_03.CEL")
>G1ST_data2<-import.data(scheme.hugene10stv1r4,
>"Pamela_NOMID_dataxps_162021", celdir=getwd(), celfiles=celfiles,
>verbose=TRUE)
>
>## RMA background
>data.bg.rma <-
>bgcorrect(G1ST_data2,"tmp_bg",method="rma",exonlevel="core+affx",
>select="antigenomic", option="pmonly:epanechnikov",params=c(16384))
>
># get intensities
>data.bg.rma<-attachInten(data.bg.rma)
>data.int<-intensity(data.bg.rma)
>
># normalize with affy functions
>detach(package:xps)
>library(preprocessCore)
>data.int.norm<-normalize.quantiles.robust(as.matrix(data.int[,-
c(1,2)]),n.remove=2,remove.extreme='both')
># manually remove one chip
>data.int.norm <- data.int.norm[,-4]
>
># replace intensity slot
>library(xps)
>aaa<-as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
># Problem: need to remove colname of chip which was removed
>colnames(aaa)<-colnames(data.int)[-6]
>intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- aaa
># Problem: does overwrite original CEL-files
>boxplot(data.bg.rma) #boxplot is OK
>
>## summarize medianpolish
>setName(data.bg.rma) <- "DataSet"
>data.mp.rma <-
>summarize.rma(data.bg.rma,"tmp_sum_rma",exonlevel="core+affx")
>boxplot(data.mp.rma)
>- - - - - - - - - - - - - - - - - - - - - - - - -
>
>Best regards
>Christian
>
>
>Mayte Suarez-Farinas wrote:
>>> Dear Christian,
>> I am sorry I need to bother you gain !
>> Everything worked fine with the background correction, the quantile
>> normalization and the substitution
>> using function "intensity()<-". When I do the boxplot after this, teh
>> data is normalized. Then when I use summarize.rma,
>> after that the data is not normalized anymore.
>>
>> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
>> boxplot(data.bg.rma) ## Boxplot is perfect!
>> setName(data.bg.rma) <- "DataSet"
>> data.mp.rma <-
>> summarize.rma(data.bg.rma,"tmp_sum_rma",exonlevel="core+affx")
>> boxplot(data.mp.rma)
>> #Boxplot is NOT ok
>>
>> any hint?
>>
>> Best
>>
>> Mayte
>>
>>
>>>
>>> The new replacement method "intensity()<-" has an option to create a
>>> new ROOT file (see?intensity), thus you need to do:
>>>
>>> library(xps)
>>> str(data.int)
>>>
>>> data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
>>>
>>> Here you see that I added the (x,y) coordinates, but it is up to you
>>> to make sure that the order is correct.
>>> I am using cbind() to prevent cycling of the samples, which is what I
>>> get when using "data.int[,-c(1,2)]".
>>>
>>> Now I can use the replacement method:
>>>
>>> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
>>> str(data.bg.rma)
>>> boxplot(data.bg.rma) #boxplot is OK
>>>
>>> Please note that this will take some time since the
>>> background-corrected intensities will first be saved as CEL-files
>>> which are then imported into the new ROOT file "tmp_int2_cel.root".
>>>
>>> Now you can summarize the data using xps, but you need to replace the
>>> setname first:
>>>
>>> setName(data.bg.rma) <- "DataSet"
>>> data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma",
>>> exonlevel="core+affx")
>>> boxplot(data.mp.rma) #boxplot is now OK.
>>>
>>> I hope this helps.
>>> Best regards
>>> Christian
>>>
>>>
>
>
>
---------End of Included Message----------
Mayte Suarez Farinas
Research Associate
The Rockefeller University Hospital
1230 York Avenue, Box 178
New York, NY 10021
(212) 327-8213 - phone
(212) 327-7422 - fax
farinam at rockefeller.edu
More information about the Bioconductor
mailing list