[BioC] Limma: how to combine duplicateCorrelation, dyeeffect and arrayweights?
dorthe.belgardt at medisin.uio.no
dorthe.belgardt at medisin.uio.no
Wed Nov 7 17:21:09 CET 2007
Hi,
I am quite insecure if some parts of the analyis I did in Limma are really
correct and I would highly appreciate if someone could take a look and
give advice. My main concern is that I may not use the
duplicatecorrelation, dyeeffect,arrayweights and spotweights correctly.
The arrays I use are printed in duplicates with a spacing of 15000 (so
30000 features in total), and I did the imageprocessing in GenePixPro6.1.
Thereby I flagged all spots close to backgroundsignal and with a rgn r2
<0.5 bad, and only 30% of my data remain unflagged.
And this is what I did using Limma:
> targets=readTargets("Targets_basicSat.txt")
> targets
SlideNumber FileName Cy3 Cy5
1 1 3096_basicSat.gpr ref A
2 2 3079_basicSat.gpr A ref
3 3 3089_basicSat.gpr ref A
4 4 3081_basicSat.gpr A ref
5 5 3071_basicSat.gpr ref B
6 6 3082_basicSat.gpr B ref
7 7 3085_basicSat.gpr ref B
8 8 8268_basicSat.gpr B ref
9 9 7829_basicSat.gpr ref C
10 10 3086_basicSat.gpr C ref
11 11 7823_basicSat.gpr ref C
12 12 7826_basicSat.gpr C ref
13 13 3090_basicSat.gpr ref D
14 14 3091_basicSat.gpr D ref
15 15 3092_basicSat.gpr ref D
16 16 7827_basicSat.gpr D ref
Every other slide is a dyeswapped technical replicate and per "group"
(A,B,C,D) there are 2 biological replicates.
> K=read.maimages(targets$FileName, source="genepix.median",
wt.fun=wtflags(0))
> types=readSpotTypes("SpottypesGAPDH.txt")
> Status=controlStatus(types, K)
> K$genes$Status=Status
> K3=backgroundCorrect(K, method=normexp, offset=50)
> K3=normalizeWithinArrays(K3, method="median")
> K3a=normalizeBetweenArrays(K3, method="quantile")
> design=modelMatrix(targets, ref="ref")
> design
A B C D
[1,] 1 0 0 0
[2,] -1 0 0 0
[3,] 1 0 0 0
[4,] -1 0 0 0
[5,] 0 1 0 0
[6,] 0 -1 0 0
[7,] 0 1 0 0
[8,] 0 -1 0 0
[9,] 0 0 1 0
[10,] 0 0 -1 0
[11,] 0 0 1 0
[12,] 0 0 -1 0
[13,] 0 0 0 1
[14,] 0 0 0 -1
[15,] 0 0 0 1
[16,] 0 0 0 -1
Since I am expecting a non-negligible dyeeffect I created an other
designmatrix and the following contrastMatrix:
>design1=cbind(DyeEffect=1, design)
>design.cont=makeContrasts("A", B, A-B", levels=design1)
Next I estimate the correlation of within-array-duplicates:
>cor=duplicateCorrelation(K3b, design=design1, ndups=2, spacing=15000,
weights=K3b$weights)
My first question is: is it correct to use here the designmatrix for the
dyeeffect (design1 in this case)?
When fitting the linear model, I also want to use arrayweights, combined
with spotweights. So I gave following commands:
> aw=arrayWeights(K3b, design=design1)
> w=matvec(K3b$weights, aw)
Again the question: is it correct to use here the "design1"-matrix
considering the dyeeffect?
Then I fit the linear model:
>fit=lmFit(K3b, design=design1, ndups=2, spacing=15000, cor=cor$consensus,
weights=w)
>fit1=contrasts.fit(fit, design.cont)
>eb=eBayes(fit1)
Another thing I am worried about is that taking into account the dyeeffect
plus arrayweights plus spotweights might be a bit "too much"? Like in a
way "overtransforming" my data? Especially since approx 70% of my data
have a spotweight of zero. Might it be better to use the spotweight of 0,1
for bad spots, so that I do not loose the data completely?
My apologies for this long email, I tried hard to find out the answers for
myself reading the limmaguide and lots of other documents I found
googleing, but still feel quite "stuck" in my analysis process.
Thanks very much for any kind of help in advance!
Best regards
Dorthe
--
Dorthe Belgardt
Institute of Basic Medical Sciences
Department of Physiology
P.O. Box 1103 Blindern
0317 Oslo
Norway
More information about the Bioconductor
mailing list