[BioC] Extreme weighting values using arrayWeights

Matt Ritchie mer36 at cam.ac.uk
Fri Oct 24 18:06:17 CEST 2008


Dear James,

The code in your email looks fine to me. The improved results you observe 
(i.e. more power to detect differential expression) is also a good 
indicator that you have indeed down-weighted the less reliable arrays in 
your weighted analysis (if things went the other way I would be a bit 
worried).

Best wishes,

Matt

>Hi all,
>
>I am comparing 2 phenotypes, with 4 biological replicates for each 
>phenotype, using standard affy rat arrays (rat2302 chip). The difference 
>between phenotypes is quite low and the chips do not cluster well using 
>hierarchical clustering.
>
>When I compare the phenotypes using limma I get high fdr values. However 
>when I use array weights I get fdr values orders of magnitude lower 
>(although generally a similar order of genes) in my toptable genelist 
>output. I am however a little sceptical, mainly because the array 
>weights seem quite extreme in terms of how different they are.
>
>The code is as follows:
>
>############################
>
>#################
># with weighting
>
>expM <- exprs(eset_RMA)
>ts <- c(rep('Sham',4),rep('VZV',4))
>ts <- factor(ts,levels=c('Sham','VZV'))
>design <- model.matrix(~0+ts)
>colnames(design) <- levels(ts)
>
>arrayw = arrayWeights(eset_RMA,design=design)
>cat("Unfiltered arrayweights:",arrayw,"\n")
>
>fit <- lmFit(expM,design,weights=arrayw)
>cont.matrix <- makeContrasts(
>               ShamvsVZV = Sham-VZV,
>               levels=design)
>fit2 <- contrasts.fit(fit, cont.matrix)
>fit2 <- eBayes(fit2)
>genelist_weight <- topTable(fit2, coef = 1 , number = featureNumber, 
>adjust="fdr")
>
>write.table(genelist_weight,file='RMA_weight_Res.txt',sep="\t", 
>quote=FALSE, row.names=FALSE)
>write.table(genelist_weight$ID,file='onlyID-Res.txt',sep="\t", 
>quote=FALSE, row.names=FALSE, col.names=FALSE)
>
>###################
># Without weighting
>
>expM <- exprs(eset_RMA)
>ts <- c(rep('Sham',4),rep('VZV',4))
>ts <- factor(ts,levels=c('Sham','VZV'))
>design <- model.matrix(~0+ts)
>colnames(design) <- levels(ts)
>fit <- lmFit(expM,design)
>cont.matrix <- makeContrasts(
>               ShamvsVZV = Sham-VZV,
>               levels=design)
>fit2 <- contrasts.fit(fit, cont.matrix)
>fit2 <- eBayes(fit2)
>genelist <- topTable(fit2, coef = 1 , number = featureNumber, adjust="fdr")
>
>write.table(genelist,file='RMA_Res.txt',sep="\t", quote=FALSE, 
>row.names=FALSE)
>write.table(genelist$ID,file='onlyID-Res.txt',sep="\t", quote=FALSE, 
>row.names=FALSE, col.names=FALSE)
>
>####################
>
>and the array weights are:
>0.5663663 1.011729 1.043341 3.517868 0.5113972 0.3253149 6.825461 0.418734
>
>so for one phenotype the results are: 0.5113972 0.3253149 6.825461 0.418734
>
>Intuitively it seems strange that one array is much more precise than 
>the other 3, which are way below <1.
>
>This is leading to my skepticism about the results... it seems almost 
>too good to be true. It also makes me wonder how many discarded arrays 
>and whole experiments may have benefitted enormously from array weights 
>in the past.
>
>My question is simply... what are your thoughts on such extreme values? 
>Does it sound like I've done something wrong in my code? Or perhaps I 
>should stop complaining and  use the weightings.
>
>
>A small example of my toptable genelists are:
>
>With weights:
>
>ID      logFC   AveExpr t       P.Value adj.P.Val       B
>1389589_at      0.77457288247666        7.74045139913265        
>11.0892719288942        4.34668043426807e-07    0.00503577435866138     6.5
>94696641953
>1369415_at      0.93679260752035        6.76922751086505        
>10.8782990394042        5.23040979395345e-07    0.00503577435866138     6.4
>4372245012463
>1368885_at      0.649670459155076       7.92097004624013        
>10.5687769995562        6.90073898637616e-07    0.00503577435866138     6.2
>1525281704988
>1368914_at      -0.729385569512816      7.87901231082336        
>-10.4928754755850       7.39376095726585e-07    0.00503577435866138     6.1
>5792530874207
>1387998_at      0.898658658686192       6.99457560109567        
>10.3291078255848        8.59321711388731e-07    0.00503577435866138     6.0
>3244169393093
>1397317_at      1.08780675280058        8.233543987132  
>10.0026932390152        1.16657933646706e-06    0.00503577435866138     
>5.774832301
>45392
>1373771_at      0.86174024105482        6.15057655743681        
>9.87485379533585        1.31792335557147e-06    0.00503577435866138     5.6
>7113815136158
>1392736_at      1.02143967541956        7.2540657002723 
>9.70827346419623        1.548011169664e-06      0.00503577435866138     
>5.533584205
>92699
>1370097_a_at    0.694423466771426       7.01711652256997        
>9.67148255458559        1.60450494629841e-06    0.00503577435866138     5.5
>0282664866487
>
>Without weights:
>
>ID      logFC   AveExpr t       P.Value adj.P.Val       B
>1389589_at      0.763976292801905       7.74045139913265        
>7.21599644492725        3.96753405748173e-05    0.244892618065609       1.8
>8355105071826
>1368914_at      -0.768154900805746      7.87901231082336        
>-7.198918687719 4.04430256120508e-05    0.244892618065609       1.870778910
>19229
>1392736_at      1.00243181553505        7.2540657002723 
>7.07052168431582        4.67604561943007e-05    0.244892618065609       
>1.773433205
>30607
>1390561_at      0.585940649960035       9.77772749089806        
>7.00089565509309        5.06295373942562e-05    0.244892618065609       1.7
>1965842262294
>1373416_at      0.748380638598739       7.37833535955067        
>6.91865859283585        5.56541633075852e-05    0.244892618065609       1.6
>5523467357811
>1376731_at      0.660349895702276       6.8931169032963 
>6.69369593102449        7.23913965899781e-05    0.244892618065609       
>1.473859334
>09197
>1380128_at      0.669565213058863       8.81473490828032        
>6.69004109956142        7.27049388279457e-05    0.244892618065609       1.4
>7084936117864
>1370097_a_at    0.754022396779006       7.01711652256997        
>6.59976057461934        8.0938008803152e-05     0.244892618065609       1.3
>9584414168989
>1390398_at      0.761298085052086       6.79062341144723        
>6.50221189827325        9.09874101750845e-05    0.244892618065609       1.3
>1337309014455
>1368030_at      1.05560381977689        6.19823184930056        
>6.45542913940574        9.62809381553364e-05    0.244892618065609       1.2
>7328857900046
>
>
>Any thoughts would be much appreciated.
>
>Kindest regards,
>
>James Perkins



More information about the Bioconductor mailing list