[BioC] Limma toptable question

Wed Nov 30 18:35:24 CET 2005

The ranking is determined by other values besides fold change (M). For
example, a probeset with high M but also high variance will ranked lower
than one with average M but much less variance. With LIMMA there are a
few other parameters to consider as well.

Regards, Adai

On Wed, 2005-11-30 at 09:31 -0600, Hua Weng wrote:
> Dear Bioconductor:
> 
>  
> 
> There are two reasons that I think slr0146 shouldn't appear in top list:
> 
> 1) The average M value, 0.3261, for slr0146 is less than 0.5
> 
> > MA.norm$M[MA.norm$genes["Name"] == "slr0146",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]      0.3837       -0.2760
> 
> > [2,]      0.3677       -0.2824
> 
> > [3,]      0.3395       -0.2802
> 
> 2) There is only one full weight for slr0146, the other five spots have 0.1
> weight that means the other five spots have very low intensity in both
> channels
> 
> > MA.norm$weights[MA.norm$genes["Name"] == "slr0146",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]         0.1           0.1
> 
> > [2,]         0.1           0.1
> 
> > [3,]         1.0           0.1
> 
>  
> 
> For slr1501 and slr1113, both of them have very high average M value, and
> all the spots for these two probes are full weights that means all these
> spots have strong intensity.
> 
> > MA.norm$M[MA.norm$genes["Name"] == "slr1501",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]       4.977        -4.874
> 
> > [2,]       4.950        -4.155
> 
> > [3,]       4.896        -4.649
> 
> > MA.norm$M[MA.norm$genes["Name"] == "slr1113",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]       2.350        -1.560
> 
> > [2,]       2.258        -1.605
> 
> > [3,]       2.398        -1.725
> 
>  
> 
> Thank you very much for your quick response.
> 
>  
> 
> Hua Weng
> 
>  
> 
> -----Original Message-----
> 
> From: Gordon K Smyth [mailto:smyth at wehi.EDU.AU]
> 
> Sent: Tuesday, November 29, 2005 3:12 PM
> 
> To: Hua Weng
> 
> Subject: Re:Limma Top Table question
> 
>  
> 
> Dear Hua,
> 
>  
> 
> You don't appear to have sent your question to the bioconductor mailing
> list.  I suggest you check the email address.
> 
>  
> 
> See my questions below.
> 
>  
> 
> On Wed, November 30, 2005 7:47 am, Hua Weng wrote:
> 
> > Hello Dr. Smyth and Bioconductor:
> 
> > 
> 
> > I have a question about toptable in limma package in Bioconductor. I 
> 
> > have confusing about the result after I fit linear model to my data. I 
> 
> > have two dye-swap technical replicate slides and three technical 
> 
> > replicates within each slide. I wrote a filter function and gave low 
> 
> > weights to the spots with low intensity in both channels. The source code
> is as following:
> 
> > 
> 
> > library(limma)
> 
> > filter <- function(raw, Rcutoff, Rreplace, Gcutoff, Greplace) { files 
> 
> > <- dir(path=".", pattern=".gpr") Gfsd <- Rfsd <- NULL ngenes <- 
> 
> > length(raw$R[,1]) for(f in files) { skip <-  grep("Block", 
> 
> > readLines(f, n=-1)) - 1
> 
> > print(skip)
> 
> > dat <- read.table(f, sep="           ", as.is=TRUE, skip=skip,
> header=TRUE,
> 
> > quote = '"', comment.char="", check.names = FALSE, nrows = ngenes)
> 
> > Gfsd<-cbind(Gfsd,as.numeric(dat[,"B635 SD"]))
> 
> > Rfsd<-cbind(Rfsd,as.numeric(dat[,"B532 SD"])) } a <- raw$R - raw$Rb b 
> 
> > <- raw$G - raw$Gb e <- raw$R - raw$Rb - 1*Rfsd f <- raw$G - raw$Gb - 
> 
> > 1*Gfsd raw$weights[a<Rcutoff & b<Gcutoff] <- 0.1
> 
> > print(raw$weights[618,])
> 
> > raw$weights[e<Rcutoff & f<Gcutoff] <- 0.1
> 
> > print(raw$weights[618,])
> 
> > raw  }
> 
> > options(digits=4)
> 
> > files <- dir(pattern=".gpr")
> 
> > RG <- read.maimages(files, columns=list(Rf="F532 Mean", Gf="F635 
> 
> > Mean",
> 
> > Rb="B532 Median", Gb="B635 Median"), wt.fun=wtflags(0.1)) RG.filt <- 
> 
> > filter(RG,  200,  200,  200,  200) skip <-  grep("Row", 
> 
> > readLines("slide151120.gpr", n=-1)) - 1 ngenes <- length(RG$R[,1])
> 
> > z <- read.table("slide151120.gpr",sep="   ", as.is=TRUE, skip=skip,
> 
> > header=TRUE, quote = '"', comment.char="", check.names = FALSE, nrows 
> 
> > =
> 
> > ngenes)
> 
> > genes <- data.frame(cbind("Block"=z[,"Block"],
> 
> > "Row"=z[,"Row"],"Column"=z[,"Column"],"Name"=z[,"Name"],"ID"=z[,"ID"])
> 
> > )
> 
> > printer <- getLayout(z)
> 
> > RG.filt$printer <- printer
> 
> > RG.filt$genes <- genes
> 
> > RG$genes <- RG.filt$genes
> 
> > RG$printer <- RG.filt$printer
> 
> > MA.norm <- normalizeWithinArrays(RG.filt, method="printtiploess") 
> 
> > MA.norm <- normalizeBetweenArrays(MA.norm) design <- c(1,-1) i <- 
> 
> > order(genes$Name) geneList <- data.frame(genes$Name[i]) MA.dup <- NULL 
> 
> > MA.dup$M <- MA.norm$M[i, 1:2] MA.dup$A <- MA.norm$A[i, 1:2] 
> 
> > MA.dup$weights <- MA.norm$weights[i, 1:2] fit <- lmFit(MA.dup, design, 
> 
> > weights=MA.norm$weights[i, 1:2], ndups=3) geneList <- 
> 
> > uniquegenelist(geneList,ndups=3) eb <- ebayes(fit) x <- 
> 
> > toptable(number=length(fit$coefficients), genelist=geneList, fit=fit, 
> 
> > A = fit$Amean, eb=eb, adjust="fdr") write.table(x, 
> 
> > file="diff_result.txt", sep="\t")
> 
> > 
> 
> > The code above works without any error. The following are some display 
> 
> > from
> 
> > R:
> 
> > 
> 
> > source("C:/project/Ivy/ma.R")
> 
> > Read slide151120.gpr
> 
> > Read slide151274-2.gpr
> 
> > [1] 31
> 
> > [1] 31
> 
> >   slide151120 slide151274-2
> 
> >             1             1
> 
> >   slide151120 slide151274-2
> 
> >             1             1
> 
> >> x[1:20,]
> 
> >      genes.Name.i.       M       t   P.Value     B
> 
> > 1853       sll1393  1.1650  26.748 0.0008727 5.960
> 
> > 2401       slr0146  0.3261  28.124 0.0008727 5.505
> 
> > 3355       slr1501  4.7499  15.945 0.0099824 4.310
> 
> > 2918       slr0889 -0.4134 -13.855 0.0160775 3.517
> 
> > 2054       sll1663  0.4032  12.534 0.0161267 3.118
> 
> > 1455       sll0800  0.7127  12.443 0.0161267 3.088
> 
> > 2142       sll1774  0.8550  11.913 0.0161267 3.044
> 
> > 2410       slr0161  0.1676  11.751 0.0161267 2.980
> 
> > 968        sll0053  0.3731  11.596 0.0161267 2.976
> 
> > 3071       slr1113  1.9828  11.480 0.0161267 2.870
> 
> > 3337       slr1469  0.6812  11.633 0.0161267 2.649
> 
> > 1337       sll0641 -0.1646 -10.791 0.0205846 2.539
> 
> > 2537       slr0350 -0.2023 -10.450 0.0206605 2.443
> 
> > 3073       slr1115  0.9906  10.362 0.0206605 2.347
> 
> > 4021       ssr1736  0.2674   9.721 0.0240955 2.039
> 
> > 3338       slr1470  0.6108   9.941 0.0238879 2.004
> 
> > 2483       slr0273  0.2555   9.604 0.0240955 1.980
> 
> > 1453       sll0797  0.3010   9.428 0.0240955 1.890
> 
> > 3069       slr1109 -0.2019  -9.452 0.0240955 1.867
> 
> > 3850       smr0003  0.4657   9.428 0.0240955 1.856
> 
> > MA.norm$M[MA.norm$genes["Name"] == "slr1501",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]       4.977        -4.874
> 
> > [2,]       4.950        -4.155
> 
> > [3,]       4.896        -4.649
> 
> > MA.norm$M[MA.norm$genes["Name"] == "slr0146",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]      0.3837       -0.2760
> 
> > [2,]      0.3677       -0.2824
> 
> > [3,]      0.3395       -0.2802
> 
> > MA.norm$M[MA.norm$genes["Name"] == "sll1393",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]       1.114        -1.242
> 
> > [2,]       1.143        -1.223
> 
> > [3,]       1.118        -1.150
> 
> > MA.norm$M[MA.norm$genes["Name"] == "slr1113",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]       2.350        -1.560
> 
> > [2,]       2.258        -1.605
> 
> > [3,]       2.398        -1.725
> 
> >> MA.norm$weights[MA.norm$genes["Name"] == "slr1113",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]           1             1
> 
> > [2,]           1             1
> 
> > [3,]           1             1
> 
> > MA.norm$weights[MA.norm$genes["Name"] == "sll1393",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]           1             1
> 
> > [2,]           1             1
> 
> > [3,]           1             1
> 
> > MA.norm$weights[MA.norm$genes["Name"] == "slr0146",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]         0.1           0.1
> 
> > [2,]         0.1           0.1
> 
> > [3,]         1.0           0.1
> 
> > MA.norm$weights[MA.norm$genes["Name"] == "slr1501",]
> 
> >      slide151120 slide151274-2
> 
> > [1,]           1             1
> 
> > [2,]           1             1
> 
> > [3,]           1             1
> 
> > 
> 
> > I am confusing about the output from toptable after I fit ebays 
> 
> > function. I think slr0146 with average M value 0.3261 shouldn't appear 
> 
> > in top list but it does show up as the second significantly differential
> expressed gene.
> 
>  
> 
> Why do you think that slr0146 should not appear?
> 
>  
> 
> > And
> 
> > gene slr1501 should appear as the first and slr1113 should appear in 
> 
> > top five but it doesn't.
> 
>  
> 
> Again, what do you think this?
> 
>  
> 
> Gordon
> 
>  
> 
> > Could you be kind enough to tell me what's wrong in my code?
> 
> > 
> 
> > Thanks in advance for your time.
> 
> > 
> 
> > Best Regards,
> 
> > Hua Weng
> 
> > 
> 
> > Scientific Programmer
> 
> > Microarray Core Facility
> 
> > Oklahoma State University
> 
> > Department of Biochemistry and Molecular Biology
> 
> > 246 Noble Research Center
> 
> > Stillwater, OK  74078
> 
> > hweng at biochem.okstate.edu
> 
> > (405) 744-6209
> 
> > FAX (405) 744-7799
> 
> > http://microarray.okstate.edu/
> 
>  
> 
> 
> 
> 
> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>