[BioC] VST advice (Illumina microarray)
Pan Du
dupan at northwestern.edu
Mon Jan 28 17:03:08 CET 2008
Hi Mark,
I found the reason: you drew the mean-sd plot from all samples together.
Large variances between different sample types are expected, and they show
up in the high-intensity range. If you draw the mean-sd plot from samples of
the same type only, the plot should look fine (some variance may remain
among replicates for uncontrolled reasons).
Please try the following code:
# Columns 1:5 are the five replicates of the first sample type (ILM_1_A1..A5)
for (i in 1:4) {
  meanSdPlot(normDataList[[i]][, 1:5], main = names(normDataList)[i])
}
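The same check can be run for every sample type rather than only the first
five columns. Below is a minimal sketch in base R, not the lumi or vsn
implementation. It uses simulated data as a stand-in for one matrix from
normDataList, and it assumes the sample type is the letter before the
replicate number in column names such as ILM_1_A1. The names mat, group,
row_sd, sd_all and sd_within are made up for illustration, and the simulation
does not model the intensity dependence seen in the real data.

```r
# Simulated stand-in for one matrix of normDataList (NOT the MAQC data):
# 1000 probes x 10 arrays, two sample types with five replicates each.
set.seed(1)
n_probes <- 1000
base  <- rnorm(n_probes, mean = 8, sd = 2)          # shared probe-level signal
shift <- c(rep(0, 900), rnorm(100, sd = 3))         # 100 probes differ between types
noise <- function() rnorm(n_probes, sd = 0.1)       # replicate-level noise
mat <- cbind(
  sapply(1:5, function(j) base + noise()),          # type A replicates
  sapply(1:5, function(j) base + shift + noise())   # type B replicates
)
colnames(mat) <- c(paste0("ILM_1_A", 1:5), paste0("ILM_1_B", 1:5))

# Assumed naming scheme: ILM_1_<type letter><replicate number>
group <- sub("^ILM_1_([A-Z])[0-9]+$", "\\1", colnames(mat))

row_sd <- function(m) apply(m, 1, sd)

# Per-probe sd across all arrays (what the original plot was based on)
sd_all <- row_sd(mat)

# Per-probe sd within each sample type (one column per type)
sd_within <- sapply(split(seq_len(ncol(mat)), group),
                    function(idx) row_sd(mat[, idx, drop = FALSE]))

# Compare the upper tail of the sd distribution
print(quantile(sd_all, 0.95))
print(apply(sd_within, 2, quantile, probs = 0.95))

# With the vsn package loaded, the per-type plots would be, for example:
# for (g in unique(group)) meanSdPlot(mat[, group == g])
```

In this simulation the upper tail of sd_all is much larger than that of
sd_within, which mirrors the effect described above: differences between
sample types inflate the sd when all arrays are pooled.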
Let me know if this is not clear. Thanks!
Have a nice day,
Pan
On 1/28/08 8:02 AM, "Mark Dunning" <md392 at cam.ac.uk> wrote:
>>
>>> Hi guys,
>>>
>>> Hope you're both well. I just read your VST paper in NAR. It seems like
>>> a good method and I am keen to use it for my own data.
>>>
>>> To begin with I have been looking at data from the MAQC project - a
>>> dilution series using Human-6 chips and have been following the code in
>>> your vignette. I have been able to reproduce the results in the vignette
>>> for the Barnes data, but not for this data. I am attaching the
>>> 'meanSdPlot' I obtain and as you can see there is an increase of sd
>>> after about 30,000, which is not the behaviour I would expect. Would you
>>> be able to suggest what the problem might be?
>>>
>>>
>>> I have a lumi-batch object created using the non-normalised data for 19
>>> of the MAQC arrays and here is the summary.
>>>
>>>> summary(exprs(x.lumi))
>>>     ILM_1_A1          ILM_1_A2          ILM_1_A3          ILM_1_A4
>>> Min.   :    30.2  Min.   :    33.3  Min.   :    33.8  Min.   :    33.8
>>> 1st Qu.:    52.5  1st Qu.:    51.1  1st Qu.:    52.1  1st Qu.:    52.8
>>> Median :    59.0  Median :    57.7  Median :    59.0  Median :    59.6
>>> Mean   :   271.0  Mean   :   268.3  Mean   :   303.4  Mean   :   280.4
>>> 3rd Qu.:   103.3  3rd Qu.:   102.2  3rd Qu.:   111.4  3rd Qu.:   105.9
>>> Max.   : 33398.3  Max.   : 30155.3  Max.   : 35888.5  Max.   : 32372.4
>>>     ILM_1_A5          ILM_1_B1          ILM_1_B2          ILM_1_B3
>>> Min.   :    35.9  Min.   :    33.4  Min.   :    32.7  Min.   :    31.7
>>> 1st Qu.:    60.6  1st Qu.:    50.4  1st Qu.:    50.8  1st Qu.:    47.9
>>> Median :    68.2  Median :    56.6  Median :    57.1  Median :    54.4
>>> Mean   :   331.1  Mean   :   235.5  Mean   :   245.5  Mean   :   251.0
>>> 3rd Qu.:   122.8  3rd Qu.:    99.3  3rd Qu.:   102.4  3rd Qu.:   103.4
>>> Max.   : 38188.5  Max.   : 38868.7  Max.   : 37145.1  Max.   : 35338.0
>>>     ILM_1_B4          ILM_1_B5          ILM_1_C1          ILM_1_C2
>>> Min.   :    32.0  Min.   :    32.4  Min.   :    33.8  Min.   :    33.2
>>> 1st Qu.:    51.5  1st Qu.:    59.7  1st Qu.:    53.3  1st Qu.:    52.6
>>> Median :    58.2  Median :    67.1  Median :    60.1  Median :    59.5
>>> Mean   :   258.9  Mean   :   285.2  Mean   :   261.9  Mean   :   279.0
>>> 3rd Qu.:   106.1  3rd Qu.:   120.7  3rd Qu.:   113.2  3rd Qu.:   116.6
>>> Max.   : 37861.9  Max.   : 40534.5  Max.   : 27707.1  Max.   : 31168.6
>>>     ILM_1_C4          ILM_1_C5          ILM_1_D1          ILM_1_D2
>>> Min.   :    32.8  Min.   :    34.1  Min.   :    32.0  Min.   :    30.2
>>> 1st Qu.:    52.3  1st Qu.:    57.6  1st Qu.:    50.2  1st Qu.:    52.1
>>> Median :    59.1  Median :    65.5  Median :    57.1  Median :    58.9
>>> Mean   :   270.2  Mean   :   296.7  Mean   :   250.3  Mean   :   258.9
>>> 3rd Qu.:   114.6  3rd Qu.:   125.7  3rd Qu.:   112.5  3rd Qu.:   116.0
>>> Max.   : 29069.5  Max.   : 33560.1  Max.   : 32544.1  Max.   : 35139.3
>>>     ILM_1_D3          ILM_1_D4          ILM_1_D5
>>> Min.   :    35.8  Min.   :    32.5  Min.   :    37.1
>>> 1st Qu.:    54.6  1st Qu.:    58.7  1st Qu.:    59.4
>>> Median :    61.6  Median :    66.7  Median :    67.6
>>> Mean   :   268.1  Mean   :   289.6  Mean   :   300.5
>>> 3rd Qu.:   121.2  3rd Qu.:   131.0  3rd Qu.:   134.0
>>> Max.   : 36666.1  Max.   : 37621.7  Max.   : 41185.4
>>>
>>>
>>> I then apply the following transforms and create a normalised data
>>> object as in the vignette.
>>>
>>> x.lumi.vst <- lumiT(x.lumi)
>>> x.lumi.vst.quantile <- lumiN(x.lumi.vst, method='quantile')
>>>
>>> ## log2 transform and Quantile normalization
>>> x.lumi.log <- lumiT(x.lumi, method='log2')
>>> x.lumi.log.quantile <- lumiN(x.lumi.log, method='quantile')
>>>
>>> x.lumi.vsn <- lumiN(x.lumi, method='vsn', lts.quantile=0.5)
>>>
>>>
>>> normDataList <- list('Raw.Log2'=exprs(x.lumi.log),
>>> 'VST.Quantile'=exprs(x.lumi.vst.quantile),
>>> 'Log2.Quantile'=exprs(x.lumi.log.quantile),
>>> 'VSN'=exprs(x.lumi.vsn))
>>>
>>> However when I run
>>>
>>>> for(i in 1:4){
>>> + meanSdPlot(normDataList[[i]], main=names(normDataList)[i])
>>> + }
>>>
>>>
>>> ...I get the attached picture. It does not seem that VST is working, or
>>> maybe I have done something wrong. Has VST been used on Human-6 data
>>> before and is there some special trick I need to use?
>>>
>>> Any help you could give would be greatly appreciated
>>>
>>> Best wishes,
>>>
>>> Mark
>>>