[BioC] Extreme weighting values using arrayWeights
James Perkins
jperkins at biochem.ucl.ac.uk
Fri Oct 17 16:40:26 CEST 2008
Hi all,
I am comparing 2 phenotypes, with 4 biological replicates for each
phenotype, using standard affy rat arrays (rat2302 chip). The difference
between phenotypes is quite low and the chips do not cluster well using
hierarchical clustering.
When I compare the phenotypes using limma I get high fdr values. However
when I use array weights I get fdr values orders of magnitude lower
(although generally a similar order of genes) in my toptable genelist
output. I am however a little sceptical, mainly because the array
weights seem quite extreme in terms of how different they are.
The code is as follows:
############################
#################
# with weighting
expM <- exprs(eset_RMA)
ts <- c(rep('Sham',4),rep('VZV',4))
ts <- factor(ts,levels=c('Sham','VZV'))
design <- model.matrix(~0+ts)
colnames(design) <- levels(ts)
arrayw = arrayWeights(eset_RMA,design=design)
cat("Unfiltered arrayweights:",arrayw,"\n")
fit <- lmFit(expM,design,weights=arrayw)
cont.matrix <- makeContrasts(
ShamvsVZV = Sham-VZV,
levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)
genelist_weight <- topTable(fit2, coef = 1 , number = featureNumber,
adjust="fdr")
write.table(genelist_weight,file='RMA_weight_Res.txt',sep="\t",
quote=FALSE, row.names=FALSE)
write.table(genelist_weight$ID,file='onlyID-Res.txt',sep="\t",
quote=FALSE, row.names=FALSE, col.names=FALSE)
###################
# Without weighting
expM <- exprs(eset_RMA)
ts <- c(rep('Sham',4),rep('VZV',4))
ts <- factor(ts,levels=c('Sham','VZV'))
design <- model.matrix(~0+ts)
colnames(design) <- levels(ts)
fit <- lmFit(expM,design)
cont.matrix <- makeContrasts(
ShamvsVZV = Sham-VZV,
levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)
genelist <- topTable(fit2, coef = 1 , number = featureNumber, adjust="fdr")
write.table(genelist,file='RMA_Res.txt',sep="\t", quote=FALSE,
row.names=FALSE)
write.table(genelist$ID,file='onlyID-Res.txt',sep="\t", quote=FALSE,
row.names=FALSE, col.names=FALSE)
####################
and the array weights are:
0.5663663 1.011729 1.043341 3.517868 0.5113972 0.3253149 6.825461 0.418734
so for one phenotype the results are: 0.5113972 0.3253149 6.825461 0.418734
Intuitively it seems strange that one array is much more precise than
the other 3, which are way below <1.
This is leading to my skepticism about the results... it seems almost
too good to be true. It also makes me wonder how many discarded arrays
and whole experiments may have benefitted enormously from array weights
in the past.
My question is simply... what are your thoughts on such extreme values?
Does it sound like I've done something wrong in my code? Or perhaps I
should stop complaining and use the weightings.
A small example of my toptable genelists are:
With weights:
ID logFC AveExpr t P.Value adj.P.Val B
1389589_at 0.77457288247666 7.74045139913265
11.0892719288942 4.34668043426807e-07 0.00503577435866138 6.5
94696641953
1369415_at 0.93679260752035 6.76922751086505
10.8782990394042 5.23040979395345e-07 0.00503577435866138 6.4
4372245012463
1368885_at 0.649670459155076 7.92097004624013
10.5687769995562 6.90073898637616e-07 0.00503577435866138 6.2
1525281704988
1368914_at -0.729385569512816 7.87901231082336
-10.4928754755850 7.39376095726585e-07 0.00503577435866138 6.1
5792530874207
1387998_at 0.898658658686192 6.99457560109567
10.3291078255848 8.59321711388731e-07 0.00503577435866138 6.0
3244169393093
1397317_at 1.08780675280058 8.233543987132
10.0026932390152 1.16657933646706e-06 0.00503577435866138
5.774832301
45392
1373771_at 0.86174024105482 6.15057655743681
9.87485379533585 1.31792335557147e-06 0.00503577435866138 5.6
7113815136158
1392736_at 1.02143967541956 7.2540657002723
9.70827346419623 1.548011169664e-06 0.00503577435866138
5.533584205
92699
1370097_a_at 0.694423466771426 7.01711652256997
9.67148255458559 1.60450494629841e-06 0.00503577435866138 5.5
0282664866487
Without weights:
ID logFC AveExpr t P.Value adj.P.Val B
1389589_at 0.763976292801905 7.74045139913265
7.21599644492725 3.96753405748173e-05 0.244892618065609 1.8
8355105071826
1368914_at -0.768154900805746 7.87901231082336
-7.198918687719 4.04430256120508e-05 0.244892618065609 1.870778910
19229
1392736_at 1.00243181553505 7.2540657002723
7.07052168431582 4.67604561943007e-05 0.244892618065609
1.773433205
30607
1390561_at 0.585940649960035 9.77772749089806
7.00089565509309 5.06295373942562e-05 0.244892618065609 1.7
1965842262294
1373416_at 0.748380638598739 7.37833535955067
6.91865859283585 5.56541633075852e-05 0.244892618065609 1.6
5523467357811
1376731_at 0.660349895702276 6.8931169032963
6.69369593102449 7.23913965899781e-05 0.244892618065609
1.473859334
09197
1380128_at 0.669565213058863 8.81473490828032
6.69004109956142 7.27049388279457e-05 0.244892618065609 1.4
7084936117864
1370097_a_at 0.754022396779006 7.01711652256997
6.59976057461934 8.0938008803152e-05 0.244892618065609 1.3
9584414168989
1390398_at 0.761298085052086 6.79062341144723
6.50221189827325 9.09874101750845e-05 0.244892618065609 1.3
1337309014455
1368030_at 1.05560381977689 6.19823184930056
6.45542913940574 9.62809381553364e-05 0.244892618065609 1.2
7328857900046
Any thoughts would be much appreciated.
Kindest regards,
James Perkins
More information about the Bioconductor
mailing list