[BioC] lumi and plotHousekeepingGene
Janet Young
jayoung at fhcrc.org
Thu Jun 2 03:29:17 CEST 2011
Hi,
I have a set of Illumina arrays and have been playing with lumi a little. It seems really useful - thank you very much.
I'm pretty new to array analysis, so I'm not sure if this is a bug or intended behavior, but here goes: I've found that plotHousekeepingGene behaves a little oddly with our data (for now, processed using lumiExpresso with default settings). I think the problem is caused by the following portion of the function, which subtracts the minimum control value from all the control datapoints before taking the log and plotting. What's the rationale for that subtraction? I might be missing something.
    if (logMode) {
        if (max(selControlData) > 50) {
            selControlData <- selControlData - min(selControlData) + 1
            selControlData <- log2(selControlData)
        }
        ylab <- "Expression Amplitude (log2)"
    }
In the plotHousekeepingGene plot of our data, one of the housekeeping genes looks quite bad, so initially I was concerned: that gene has much lower expression than the rest of them, and its expression appears to vary a lot across the arrays. Expression is indeed low, but it doesn't really vary much across the arrays when I look at the normalized data myself, so in reality I don't think I need to worry too much (although I will be checking in with the biologists about whether that gene should be high or low in the cells they've assayed).
Here's the control data that got plotted by plotHousekeepingGene, i.e. the control data after that subtraction of the minimum value (probe 101 is low, and varies widely across arrays). If there is a good rationale for the subtraction step, maybe I should actually be worried about this gene?
      array_1   array_2   array_3  array_4  array_5   array_6
101  4.307429  7.742815  8.274728  7.74685  0.00000  7.499049
102 14.052568 14.134073 14.201274 14.12779 14.14360 14.181657
103 14.667866 14.725973 14.759108 14.59437 14.53636 14.606914
104 13.095512 12.862831 12.729939 13.30600 13.08711 13.063597
105 13.768515 13.506642 14.023174 13.58535 13.93239 14.050444
106 13.313818 12.773840 12.792241 13.15706 13.29373 12.848232
107 13.916514 13.466714 13.714310 13.49404 13.82293 13.475049
But here's how the control data looks when I just take the log2 myself (probe 101 is somewhat low, but fairly constant across arrays, and not as low as the negative controls from the same arrays - their values tend to be around 6-7).
     array_1   array_2   array_3   array_4   array_5   array_6
101  8.21820  8.943101  9.198936  8.944858  8.124121  8.842036
102 14.07598 14.156210 14.222410 14.150025 14.165590 14.203080
103 14.68319 14.740698 14.773500 14.610494 14.553143 14.622898
104 13.14062 12.915693 12.787801 13.345073 13.132484 13.109700
105 13.79697 13.540697 14.047064 13.617617 13.957818 14.073891
106 13.35268 12.830000 12.847703 13.200316 13.333127 12.901621
107 13.94222 13.501713 13.743846 13.528393 13.850343 13.509849
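To make the effect concrete, here is a small sketch in base R that applies the same "subtract the minimum, add 1, then log2" step to probe 101 alone. It is only an illustration, not the lumi code itself. It assumes that the overall minimum of the control data sits in probe 101 on array_5 (which the 0.00000 in the first table suggests), so the minimum over this one probe stands in for min(selControlData). The variable names are my own.

```r
# log2 values for probe 101, copied from the plain-log2 table above
log2_101 <- c(array_1 = 8.21820,  array_2 = 8.943101, array_3 = 9.198936,
              array_4 = 8.944858, array_5 = 8.124121, array_6 = 8.842036)

# back-transform to the unlogged scale
raw_101 <- 2^log2_101

# assumption: the minimum of this probe equals the minimum of all control data
shifted_101 <- log2(raw_101 - min(raw_101) + 1)

# spread across arrays (max minus min) on each scale
spread_plain   <- diff(range(log2_101))
spread_shifted <- diff(range(shifted_101))

print(round(shifted_101, 3))
print(c(plain = spread_plain, shifted = spread_shifted))
```

The shifted values should come out close to the probe 101 row of the first table. The subtraction moves the lowest value to log2(1) = 0, so differences of about one log2 unit among the low values get stretched over a much wider range, while the high-expression probes barely change because the subtracted minimum is tiny relative to them.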
Hope this is helpful...
Thanks very much,
Janet Young
-------------------------------------------------------------------
Dr. Janet Young
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung ...at... fhcrc.org
-------------------------------------------------------------------