[BioC] lumi: how is the controlData to be read and used?

Gordon Smyth smyth at wehi.EDU.AU
Mon Oct 29 03:03:32 CET 2007

At 10:17 PM 28/10/2007, Pan Du wrote:
>What I mean by using control probe data is using the control probe
>information for quality control. For the background adjustment part,
>we currently believe that the BeadStudio recommended method works
>well. Of course, further improvement is possible, and contributions
>in this area are very welcome.

OK, good, now we're getting somewhere. You're recommending 
BeadStudio's global background correction. Let me now rephrase my 
original question. Suppose that I have BeadStudio output data which 
is not background corrected. How can I use R to reproduce the 
background correction that BeadStudio would have done?

This is a very important question, because most Bioconductor users of 
the lumi package will, I guess, have Illumina output data which is 
neither normalized nor background corrected. And we will not 
necessarily want to go back to BeadStudio to background correct.

I have summary probe profile data output from BeadStudio which is not 
background corrected. Let me repeat, it is not background corrected.


I also have control probe summary profiles and control gene summary 
profiles. This includes both positive and negative control probes:


I should surely be able to reproduce BeadStudio's background 
correction. Here is my best effort using the lumi package. Is this 
what you recommend?

   x <- lumiR("Sample_Probe_Profile.txt")
   controlgp <- lumiR("Control_Gene_Profile.txt")
   x@controlData <- as.data.frame(exprs(controlgp))
   xb <- lumiB(x,method="bgAdjust")
   y <- lumiT(xb,method="vst")
   y <- lumiN(y,method="quantile")

As you can see from the results below, lumiB() simply subtracted the 
negative control expression value from the expression values for each array.
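
The subtraction can be checked by hand from the numbers printed at the end of this message. The following is only a sketch in base R, not part of the lumi package: the values are copied from the printed output for the first four arrays, and the object names (negative, raw, adjusted, predicted, max_diff) are made up for illustration. It assumes that a constant shift per array is all that bgAdjust did here, which is the claim being tested.

```r
# Values copied from the console output in this message (first four arrays).
arrays <- c("1957998084_A", "1957998084_B", "1957998084_C", "1957998084_D")

# "negative" row of exprs(controlgp): one value per array.
negative <- setNames(c(92.0, 90.0, 83.2, 88.1), arrays)

# Min, median and max from summary(exprs(x)), before background correction.
raw <- rbind(
  Min    = c(52.9, 50.2, 48.6, 54.1),
  Median = c(99.0, 96.6, 88.7, 93.9),
  Max    = c(59875.4, 57223.1, 50414.0, 49213.6)
)
colnames(raw) <- arrays

# The same statistics from summary(exprs(xb)), after lumiB(method="bgAdjust").
adjusted <- rbind(
  Min    = c(-39.09, -39.83, -34.64, -34.08),
  Median = c(7.05, 6.65, 5.48, 5.76),
  Max    = c(59783.48, 57133.12, 50330.79, 49125.42)
)
colnames(adjusted) <- arrays

# Subtract each array's negative control value from that array's column.
predicted <- sweep(raw, 2, negative, FUN = "-")

# Disagreement between the hand calculation and the lumiB() output.
print(round(predicted - adjusted, 2))
max_diff <- max(abs(predicted - adjusted))
```

Because the printed raw and control values are rounded to one decimal place, the two sets of numbers can only be expected to agree to within about 0.1, and they do. Min, median and max are used because subtracting a constant shifts each of them by exactly that constant.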

Best wishes

 > exprs(controlgp)[,1:4]
                     1957998084_A 1957998084_B 1957998084_C 1957998084_D
biotin                   11508.6      10857.9      10641.8      10536.3
cy3_hyb                  20252.0      19227.1      18964.8      19457.2
high_stringency_hyb      47593.1      43267.2      43966.6      43207.8
housekeeping             16185.3      14039.6      13277.5      13280.2
labeling                    85.2         89.5         77.4         80.7
low_stringency_hyb       17650.5      16441.4      16330.1      16844.8
negative                    92.0         90.0         83.2         88.1
 > summary(exprs(x)[,1:4])
   1957998084_A      1957998084_B      1957998084_C      1957998084_D
  Min.   :   52.9   Min.   :   50.2   Min.   :   48.6   Min.   :   54.1
  1st Qu.:   86.6   1st Qu.:   84.3   1st Qu.:   78.2   1st Qu.:   82.3
  Median :   99.0   Median :   96.6   Median :   88.7   Median :   93.9
  Mean   :  511.4   Mean   :  501.0   Mean   :  400.3   Mean   :  448.0
  3rd Qu.:  163.9   3rd Qu.:  159.3   3rd Qu.:  138.3   3rd Qu.:  148.9
  Max.   :59875.4   Max.   :57223.1   Max.   :50414.0   Max.   :49213.6
 > summary(exprs(xb)[,1:4])
   1957998084_A       1957998084_B       1957998084_C       1957998084_D
  Min.   :  -39.09   Min.   :  -39.83   Min.   :  -34.64   Min.   :  -34.08
  1st Qu.:   -5.40   1st Qu.:   -5.73   1st Qu.:   -5.01   1st Qu.:   -5.80
  Median :    7.05   Median :    6.65   Median :    5.48   Median :    5.76
  Mean   :  419.47   Mean   :  411.01   Mean   :  317.04   Mean   :  359.90
  3rd Qu.:   71.95   3rd Qu.:   69.27   3rd Qu.:   55.08   3rd Qu.:   60.77
  Max.   :59783.48   Max.   :57133.12   Max.   :50330.79   Max.   :49125.42
