[BioC] Question on limma single-channel analysis of dual-channel microarray
Deanne Taylor
dtaylor at hsph.harvard.edu
Sun Sep 16 03:21:38 CEST 2007
Thanks, Naomi.
When I looked at the simple limma analysis of the 3 biological replicates (disease vs normal) the results closely correlated to some immunohistochemistry results we had -- so I tend to believe (even tho a small sample size) the simple limma analysis results on the three biological replicates.
The differences in the IHC-measured genes in the single-channel approach did not match the IHC results at all, so I have a more difficult time believing the single-channel approach is approximating the biology.
However, I do want to look across disease and across dose, so the single channel method is my only option it seems.
I appreciate your suggestion (changing the intercept to "1"?) to look for deviations from the mean (as in model.matrix(~1 + f) with the single channel approach.
Deanne
---
Deanne Taylor PhD
Department of Biostatistics
Harvard School of Public Health
655 Huntington Avenue
Boston, MA 02115
dtaylor at hsph.harvard.edu
>>> Naomi Altman <naomi at stat.psu.edu> 09/15/07 8:51 PM >>>
Hi Deanne,
Gordon can correct me here if I am wrong, but I think that when there
is no intercept in the design matrix then intraspotCorrelation uses
deviations from 0, rather than deviations from the mean, which is the
wrong type of correlation to account for the array effect.
Besides this, I have noticed (and I think others have noted on this
list) that small changes to microarray data, such as the
normalization method, exclusion of a few arrays, etc, can lead to
large changes in the list of significant genes. On the other hand,
the genes that jump on and off the list are often low expressing
genes. For these genes, the F or t-test denominator is very sensitive
to small changes in the data. Regularization methods such as limma
and SAM damp this somewhat, but in small sample sizes you can still
get sizeable effects. If you use all of the arrays, you are doing a
much different denominator regularization than if you use only a few arrays.
--Naomi
At 08:21 PM 9/15/2007, Deanne Taylor wrote:
>Hi...
>
>I have a few complex analyses I am trying to perform on some older
>data. I've tried looking through the mailing lists and literature
>and perhaps what I have here is a difficulty in understanding. So,
>on to my question...
>
>I have 48 microarrays of two-channel Agilent data, no technical
>replicates or dye swaps, but 3 biological replicates of 16 groups.
>
>For instance, one one chip would be Disease_24h_1mg vs
>Normal_24h_1mg. And so forth...
>
>There need to be comparisons made between the groups of doses and
>time series, which I'm comfortable with (at least conceptually and
>with single-channel data). There is no common reference, and most of
>this data are time series comparing two conditions and doses of
>treatment on diseased and disease-free:
>
>
>Condition Time Treatment
>Disease 24h 1mg
>Normal 24h 1mg
>Disease 48h 1mg
>Normal 48h 1mg
> (...)
>
>Disease 48h 2mg
>Normal 48h 2mg
>Disease 24h 2mg
>Normal 24h 2mg
> (...)
>
>(and so forth)
>
>I would like to make contrasts between diseased at different time
>series (24, 48, 120hours, etc) as well as across doses (1mg, 2mg,
>etc) and compare all of it to some undosed normal controls. I think
>I'm almost there, but I find I'm getting unexpected results.
>
>
>I have taken the following steps which all seem to work. In this
>script, I have left in the control spots in MA because I found some
>odd behavior when I tried to take them out with MA$genes$ControlType ==0
>
>library(limma)
>ID="Targets"
>targets<-readTargets(paste(ID, ".csv",sep=""), sep=",")
>files<-targets$FileName
>RG<-read.maimages(files, source="agilent")
>RG_norm<-normalizeWithinArrays(RG, bc.method="none", method="loess")
>#is this step necessary here?
>MA<-normalizeBetweenArrays(RG_norm, method="Aquantile")
>targets2 <- targetsA2C(targets)
>targets2
>u <- unique(targets2$Target)
>f <- factor(targets2$Target, levels=u)
>design <- model.matrix(~0+f)
>colnames(design) <- u
>corfit <- intraspotCorrelation(MA, design)
>fit <- lmscFit(MA, design, correlation=corfit$consensus)
>#Now, I wanted to check out the first "test" of what I can get with
>a normal contrast between Cy3 and Cy5 on the chips themselves:
>cont.dif<-makeContrasts(Disease_24_1mg - Normal_24_1mg, levels=design)
>fit2<-contrasts.fit(fit, cont.dif)
>fit2<-eBayes(fit2)
>toptable(fit2)
>
>
>Now, what i've just done up there is compare two groups that are on
>the same arrays, but the p-values and gene IDs I get after lmscFit
>are very dissimilar to the ones I get if I just run a simple limma
>analysis across JUST the three two channel chips representing
>Disease_24_1mg/Normal_24_1mg.
>
>I can't understand why there is such a complete difference between
>what one gets with a fit across three two-color microarrays (Cy3 vs
>Cy5) and when comparing the same two channels that were once
>represented on three single chips, together after the single-channel fit.
>
>Is my model wrong? Can someone point me to an answer? Thank you so
>much for your help.
>
>
>
>
>
>---
>Deanne Taylor PhD
>Department of Biostatistics
>Harvard School of Public Health
>655 Huntington Avenue
>Boston, MA 02115
>dtaylor at hsph.harvard.edu
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111
More information about the Bioconductor
mailing list