[BioC] VST advice (Illumina microarray)

Mon Jan 28 17:03:08 CET 2008

Hi Mark,

I found the reason: you drew the mean-sd plot using all samples. Large
variances between different sample types are expected, and they show up mainly
in the high-intensity range. If you draw the mean-sd plot using samples of a
single type only, the plot should look fine (some variance may remain among
replicates for uncontrolled reasons).

Please try the following code, which uses only the first five columns (the
ILM_1_A replicates):
for(i in 1:4){ 
   meanSdPlot(normDataList[[i]][,1:5], main=names(normDataList)[i])
}
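
To illustrate why pooling sample types inflates the sd at high intensity, here
is a minimal simulated sketch in base R. It does not use the MAQC data: the
probe counts, the noise levels and the helper row_sd are assumptions made up
for illustration, and only the column-name pattern is taken from your summary.

```r
# Simulated illustration only; these are NOT the MAQC measurements.
set.seed(1)
n_low  <- 1600                      # probes near background
n_high <- 400                       # highly expressed probes
n_probes <- n_low + n_high
is_high <- rep(c(FALSE, TRUE), times = c(n_low, n_high))

# Hypothetical column names following the ILM_1_<type><replicate> pattern.
sample_names <- paste0("ILM_1_", rep(c("A", "B", "C", "D"), each = 5), 1:5)
groups <- sub("^ILM_1_([A-D])[0-9]$", "\\1", sample_names)

# Baseline signal on a log2-like scale.
base <- c(rnorm(n_low, mean = 6, sd = 0.3), rnorm(n_high, mean = 11, sd = 1.5))

# Each sample type shifts the highly expressed probes by its own amount;
# low-intensity probes are left unchanged.
type_effect <- sapply(c("A", "B", "C", "D"),
                      function(g) ifelse(is_high, rnorm(n_probes), 0))

# Replicates differ only by small technical noise.
noise <- matrix(rnorm(n_probes * length(groups), sd = 0.1), nrow = n_probes)
x <- base + type_effect[, groups] + noise
colnames(x) <- sample_names

row_sd <- function(m) apply(m, 1, sd)
sd_pooled <- row_sd(x)                    # all sample types together
sd_within <- row_sd(x[, groups == "A"])   # replicates of one type only

# Compare the median sd among highly expressed probes.
print(c(pooled = median(sd_pooled[is_high]),
        within = median(sd_within[is_high])))
```

In this simulation the pooled sd of the highly expressed probes is dominated by
the differences between sample types, while the within-type sd reflects only
replicate noise. With the real objects, a grouping vector built the same way
from the column names could be used to draw one mean-sd plot per sample type
rather than only for columns 1:5.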

Let me know if anything is unclear. Thanks!

Have a nice day,

Pan

On 1/28/08 8:02 AM, "Mark Dunning" <md392 at cam.ac.uk> wrote:
>> 
>>> Hi guys,
>>> 
>>> Hope you're both well. I just read your VST paper in NAR. It seems like
>>> a good method and I am keen to use it for my own data.
>>> 
>>> To begin with I have been looking at data from the MAQC project - a
>>> dilution series using Human-6 chips and have been following the code in
>>> your vignette. I have been able to reproduce the results in the vignette
>>> for the Barnes data, but not for this data. I am attaching the
>>> 'meanSdPlot' I obtain and as you can see there is an increase of sd
>>> after about 30,000, which is not the behaviour I would expect. Would you
>>> be able to suggest what the problem might be?
>>> 
>>> 
>>> I have a lumi-batch object created using the non-normalised data for 19
>>> of the MAQC arrays and here is the summary.
>>> 
>>>> summary(exprs(x.lumi))
>>>     ILM_1_A1          ILM_1_A2          ILM_1_A3          ILM_1_A4
>>>  Min.   :   30.2   Min.   :   33.3   Min.   :   33.8   Min.   :   33.8
>>>  1st Qu.:   52.5   1st Qu.:   51.1   1st Qu.:   52.1   1st Qu.:   52.8
>>>  Median :   59.0   Median :   57.7   Median :   59.0   Median :   59.6
>>>  Mean   :  271.0   Mean   :  268.3   Mean   :  303.4   Mean   :  280.4
>>>  3rd Qu.:  103.3   3rd Qu.:  102.2   3rd Qu.:  111.4   3rd Qu.:  105.9
>>>  Max.   :33398.3   Max.   :30155.3   Max.   :35888.5   Max.   :32372.4
>>>     ILM_1_A5          ILM_1_B1          ILM_1_B2          ILM_1_B3
>>>  Min.   :   35.9   Min.   :   33.4   Min.   :   32.7   Min.   :   31.7
>>>  1st Qu.:   60.6   1st Qu.:   50.4   1st Qu.:   50.8   1st Qu.:   47.9
>>>  Median :   68.2   Median :   56.6   Median :   57.1   Median :   54.4
>>>  Mean   :  331.1   Mean   :  235.5   Mean   :  245.5   Mean   :  251.0
>>>  3rd Qu.:  122.8   3rd Qu.:   99.3   3rd Qu.:  102.4   3rd Qu.:  103.4
>>>  Max.   :38188.5   Max.   :38868.7   Max.   :37145.1   Max.   :35338.0
>>>     ILM_1_B4          ILM_1_B5          ILM_1_C1          ILM_1_C2
>>>  Min.   :   32.0   Min.   :   32.4   Min.   :   33.8   Min.   :   33.2
>>>  1st Qu.:   51.5   1st Qu.:   59.7   1st Qu.:   53.3   1st Qu.:   52.6
>>>  Median :   58.2   Median :   67.1   Median :   60.1   Median :   59.5
>>>  Mean   :  258.9   Mean   :  285.2   Mean   :  261.9   Mean   :  279.0
>>>  3rd Qu.:  106.1   3rd Qu.:  120.7   3rd Qu.:  113.2   3rd Qu.:  116.6
>>>  Max.   :37861.9   Max.   :40534.5   Max.   :27707.1   Max.   :31168.6
>>>     ILM_1_C4          ILM_1_C5          ILM_1_D1          ILM_1_D2
>>>  Min.   :   32.8   Min.   :   34.1   Min.   :   32.0   Min.   :   30.2
>>>  1st Qu.:   52.3   1st Qu.:   57.6   1st Qu.:   50.2   1st Qu.:   52.1
>>>  Median :   59.1   Median :   65.5   Median :   57.1   Median :   58.9
>>>  Mean   :  270.2   Mean   :  296.7   Mean   :  250.3   Mean   :  258.9
>>>  3rd Qu.:  114.6   3rd Qu.:  125.7   3rd Qu.:  112.5   3rd Qu.:  116.0
>>>  Max.   :29069.5   Max.   :33560.1   Max.   :32544.1   Max.   :35139.3
>>>     ILM_1_D3          ILM_1_D4          ILM_1_D5
>>>  Min.   :   35.8   Min.   :   32.5   Min.   :   37.1
>>>  1st Qu.:   54.6   1st Qu.:   58.7   1st Qu.:   59.4
>>>  Median :   61.6   Median :   66.7   Median :   67.6
>>>  Mean   :  268.1   Mean   :  289.6   Mean   :  300.5
>>>  3rd Qu.:  121.2   3rd Qu.:  131.0   3rd Qu.:  134.0
>>>  Max.   :36666.1   Max.   :37621.7   Max.   :41185.4
>>> 
>>> 
>>> I then apply the following transforms and create a normalised data
>>> object as in the vignette.
>>> 
>>> x.lumi.vst <- lumiT(x.lumi)
>>> x.lumi.vst.quantile <- lumiN(x.lumi.vst, method='quantile')
>>> 
>>> ## log2 transform and Quantile normalization
>>> x.lumi.log <- lumiT(x.lumi, method='log2')
>>> x.lumi.log.quantile <- lumiN(x.lumi.log, method='quantile')
>>> 
>>> x.lumi.vsn <- lumiN(x.lumi, method='vsn', lts.quantile=0.5)
>>> 
>>> 
>>> normDataList <- list('Raw.Log2'=exprs(x.lumi.log),
>>>                      'VST.Quantile'=exprs(x.lumi.vst.quantile),
>>>                      'Log2.Quantile'=exprs(x.lumi.log.quantile),
>>>                      'VSN'=exprs(x.lumi.vsn))
>>> 
>>> However when I run
>>> 
>>>> for(i in 1:4){
>>> + meanSdPlot(normDataList[[i]], main=names(normDataList)[i])
>>> + }
>>> 
>>> 
>>> ...I get the attached picture. It does not seem that VST is working, or
>>> maybe I have done something wrong. Has VST been used on Human-6 data
>>> before and is there some special trick I need to use?
>>> 
>>> Any help you could give would be greatly appreciated
>>> 
>>> Best wishes,
>>> 
>>> Mark 
>>>