[BioC] lumi: how is the controlData to be read and used?
Pan Du
dupan at northwestern.edu
Tue Oct 30 00:00:23 CET 2007
Hi Gordon,
Sorry for the late reply. I think that should work, because the
Control_Gene_Profile.txt file is basically the average of the negative control
probes. As described in the BeadStudio manual, its background adjustment
basically subtracts the mean of the negative control probes. But I am not sure
whether BeadStudio removes outliers before averaging. Anyway, the results
should be close.
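
To make the subtraction concrete, here is a minimal base-R sketch of that adjustment. It is only an illustration of the arithmetic, not the lumiB code itself. The toy intensities are the minimum, median and third quartile from the summaries quoted below, the negative control means are the "negative" row of Control_Gene_Profile.txt, and the object names (raw, negative, adjusted) are made up for this example.

```r
# Toy raw intensities (rows = probes, columns = arrays).
# Values are illustrative, borrowed from the summary quantiles quoted below.
raw <- matrix(
  c(52.9, 99.0, 163.9,
    50.2, 96.6, 159.3),
  nrow = 3,
  dimnames = list(c("p1", "p2", "p3"),
                  c("1957998084_A", "1957998084_B"))
)

# Per-array mean of the negative control probes
# (the "negative" row of Control_Gene_Profile.txt).
negative <- c("1957998084_A" = 92.0, "1957998084_B" = 90.0)

# Subtract each array's negative-control mean from every probe on that array.
# Matching by column name guards against arrays being in a different order.
adjusted <- sweep(raw, 2, negative[colnames(raw)], FUN = "-")
```

Low-intensity probes come out negative after this step, which matches the negative minima in the lumiB output below.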
Also, I will update the lumiR function (or write a new function) to read
Control_Probe_Profile.txt, which needs special handling because the negative
control probes all have the same probe ID. Thanks!
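
Until then, something along these lines could serve as a stopgap for getting the per-array negative control mean from probe-level control data. This is a rough sketch under assumptions: the column names (ControlType, arrayA, arrayB) and the numbers are invented for illustration, and a real Control_Probe_Profile.txt export will likely have a different layout.

```r
# Hypothetical layout: one row per control probe, a control-type column,
# and one intensity column per array. Real BeadStudio exports may differ.
txt <- "ControlType\tarrayA\tarrayB
negative\t90\t88
negative\t94\t92
negative\t92\t90
housekeeping\t16000\t14000"

ctrl <- read.delim(text = txt, stringsAsFactors = FALSE)

# The repeated identifier cannot be used as row names, so keep it as an
# ordinary column and select the negative control rows by value.
neg <- ctrl[ctrl$ControlType == "negative", c("arrayA", "arrayB")]

# Per-array mean of the negative control probes (no outlier removal here).
negMean <- colMeans(neg)
```

The resulting negMean vector plays the same role as the "negative" row of Control_Gene_Profile.txt.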
Pan
On 10/28/07 9:03 PM, "Gordon Smyth" <smyth at wehi.EDU.AU> wrote:
> At 10:17 PM 28/10/2007, Pan Du wrote:
>> What I mean by using the control probe data is using the control probe
>> information for quality control. For the background adjustment part,
>> we currently believe the method recommended by BeadStudio works well.
>> Of course, further improvement is possible, and contributions to this
>> part are very welcome.
>
> OK, good, now we're getting somewhere. You're recommending
> BeadStudio's global background correction. Let me now rephrase my
> original question. Suppose that I have BeadStudio output data which
> is not background corrected. How can I use R to reproduce the
> background correction that BeadStudio would have done?
>
> This is a very important question, because most Bioconductor users of
> the lumi package will I guess have Illumina output data which is not
> normalized and not background corrected. And we will not necessarily
> want to go back to BeadStudio to background correct.
>
> I have summary probe profile data output from BeadStudio which is not
> background corrected. Let me repeat, it is not background corrected.
>
> Sample_Probe_Profile.txt
>
> I also have control probe summary profiles and control gene summary
> profiles. This includes both positive and negative control probes:
>
> Control_Probe_Profile.txt
> Control_Gene_Profile.txt
>
> I should surely be able to reproduce BeadStudio's background
> correction. Here is my best effort using the lumi package. Is this
> what you recommend?
>
> library(lumi)
> x <- lumiR("Sample_Probe_Profile.txt")
> controlgp <- lumiR("Control_Gene_Profile.txt")
> x@controlData <- as.data.frame(exprs(controlgp))
> xb <- lumiB(x,method="bgAdjust")
> y <- lumiT(xb,method="vst")
> y <- lumiN(y,method="quantile")
>
> As you can see from the results below, lumiB() simply subtracted the
> negative control expression value from the expression values for each array.
>
> Best wishes
> Gordon
>
>
> > exprs(controlgp)[,1:4]
>                     1957998084_A 1957998084_B 1957998084_C 1957998084_D
> biotin                   11508.6      10857.9      10641.8      10536.3
> cy3_hyb                  20252.0      19227.1      18964.8      19457.2
> high_stringency_hyb      47593.1      43267.2      43966.6      43207.8
> housekeeping             16185.3      14039.6      13277.5      13280.2
> labeling                    85.2         89.5         77.4         80.7
> low_stringency_hyb       17650.5      16441.4      16330.1      16844.8
> negative                    92.0         90.0         83.2         88.1
> > summary(exprs(x)[,1:4])
>   1957998084_A       1957998084_B       1957998084_C       1957998084_D
> Min.   :    52.9   Min.   :    50.2   Min.   :    48.6   Min.   :    54.1
> 1st Qu.:    86.6   1st Qu.:    84.3   1st Qu.:    78.2   1st Qu.:    82.3
> Median :    99.0   Median :    96.6   Median :    88.7   Median :    93.9
> Mean   :   511.4   Mean   :   501.0   Mean   :   400.3   Mean   :   448.0
> 3rd Qu.:   163.9   3rd Qu.:   159.3   3rd Qu.:   138.3   3rd Qu.:   148.9
> Max.   : 59875.4   Max.   : 57223.1   Max.   : 50414.0   Max.   : 49213.6
> > summary(exprs(xb)[,1:4])
>   1957998084_A        1957998084_B        1957998084_C        1957998084_D
> Min.   :   -39.09   Min.   :   -39.83   Min.   :   -34.64   Min.   :   -34.08
> 1st Qu.:    -5.40   1st Qu.:    -5.73   1st Qu.:    -5.01   1st Qu.:    -5.80
> Median :     7.05   Median :     6.65   Median :     5.48   Median :     5.76
> Mean   :   419.47   Mean   :   411.01   Mean   :   317.04   Mean   :   359.90
> 3rd Qu.:    71.95   3rd Qu.:    69.27   3rd Qu.:    55.08   3rd Qu.:    60.77
> Max.   : 59783.48   Max.   : 57133.12   Max.   : 50330.79   Max.   : 49125.42
>
>
>