[BioC] Limma toptable question
Adaikalavan Ramasamy
ramasamy at cancer.org.uk
Wed Nov 30 18:35:24 CET 2005
The ranking is determined by other values besides fold change (M). For
example, a probeset with high M but also high variance will ranked lower
than one with average M but much less variance. With LIMMA there are a
few other parameters to consider as well.
Regards, Adai
On Wed, 2005-11-30 at 09:31 -0600, Hua Weng wrote:
> Dear Bioconductor:
>
>
>
> There are two reasons that I think slr0146 shouldn't appear in top list:
>
> 1) The average M value, 0.3261, for slr0146 is less than 0.5
>
> > MA.norm$M[MA.norm$genes["Name"] == "slr0146",]
>
> > slide151120 slide151274-2
>
> > [1,] 0.3837 -0.2760
>
> > [2,] 0.3677 -0.2824
>
> > [3,] 0.3395 -0.2802
>
> 2) There is only one full weight for slr0146, the other five spots have 0.1
> weight that means the other five spots have very low intensity in both
> channels
>
> > MA.norm$weights[MA.norm$genes["Name"] == "slr0146",]
>
> > slide151120 slide151274-2
>
> > [1,] 0.1 0.1
>
> > [2,] 0.1 0.1
>
> > [3,] 1.0 0.1
>
>
>
> For slr1501 and slr1113, both of them have very high average M value, and
> all the spots for these two probes are full weights that means all these
> spots have strong intensity.
>
> > MA.norm$M[MA.norm$genes["Name"] == "slr1501",]
>
> > slide151120 slide151274-2
>
> > [1,] 4.977 -4.874
>
> > [2,] 4.950 -4.155
>
> > [3,] 4.896 -4.649
>
> > MA.norm$M[MA.norm$genes["Name"] == "slr1113",]
>
> > slide151120 slide151274-2
>
> > [1,] 2.350 -1.560
>
> > [2,] 2.258 -1.605
>
> > [3,] 2.398 -1.725
>
>
>
> Thank you very much for your quick response.
>
>
>
> Hua Weng
>
>
>
> -----Original Message-----
>
> From: Gordon K Smyth [mailto:smyth at wehi.EDU.AU]
>
> Sent: Tuesday, November 29, 2005 3:12 PM
>
> To: Hua Weng
>
> Subject: Re:Limma Top Table question
>
>
>
> Dear Hua,
>
>
>
> You don't appear to have sent your question to the bioconductor mailing
> list. I suggest you check the email address.
>
>
>
> See my questions below.
>
>
>
> On Wed, November 30, 2005 7:47 am, Hua Weng wrote:
>
> > Hello Dr. Smyth and Bioconductor:
>
> >
>
> > I have a question about toptable in limma package in Bioconductor. I
>
> > have confusing about the result after I fit linear model to my data. I
>
> > have two dye-swap technical replicate slides and three technical
>
> > replicates within each slide. I wrote a filter function and gave low
>
> > weights to the spots with low intensity in both channels. The source code
> is as following:
>
> >
>
> > library(limma)
>
> > filter <- function(raw, Rcutoff, Rreplace, Gcutoff, Greplace) { files
>
> > <- dir(path=".", pattern=".gpr") Gfsd <- Rfsd <- NULL ngenes <-
>
> > length(raw$R[,1]) for(f in files) { skip <- grep("Block",
>
> > readLines(f, n=-1)) - 1
>
> > print(skip)
>
> > dat <- read.table(f, sep=" ", as.is=TRUE, skip=skip,
> header=TRUE,
>
> > quote = '"', comment.char="", check.names = FALSE, nrows = ngenes)
>
> > Gfsd<-cbind(Gfsd,as.numeric(dat[,"B635 SD"]))
>
> > Rfsd<-cbind(Rfsd,as.numeric(dat[,"B532 SD"])) } a <- raw$R - raw$Rb b
>
> > <- raw$G - raw$Gb e <- raw$R - raw$Rb - 1*Rfsd f <- raw$G - raw$Gb -
>
> > 1*Gfsd raw$weights[a<Rcutoff & b<Gcutoff] <- 0.1
>
> > print(raw$weights[618,])
>
> > raw$weights[e<Rcutoff & f<Gcutoff] <- 0.1
>
> > print(raw$weights[618,])
>
> > raw }
>
> > options(digits=4)
>
> > files <- dir(pattern=".gpr")
>
> > RG <- read.maimages(files, columns=list(Rf="F532 Mean", Gf="F635
>
> > Mean",
>
> > Rb="B532 Median", Gb="B635 Median"), wt.fun=wtflags(0.1)) RG.filt <-
>
> > filter(RG, 200, 200, 200, 200) skip <- grep("Row",
>
> > readLines("slide151120.gpr", n=-1)) - 1 ngenes <- length(RG$R[,1])
>
> > z <- read.table("slide151120.gpr",sep=" ", as.is=TRUE, skip=skip,
>
> > header=TRUE, quote = '"', comment.char="", check.names = FALSE, nrows
>
> > =
>
> > ngenes)
>
> > genes <- data.frame(cbind("Block"=z[,"Block"],
>
> > "Row"=z[,"Row"],"Column"=z[,"Column"],"Name"=z[,"Name"],"ID"=z[,"ID"])
>
> > )
>
> > printer <- getLayout(z)
>
> > RG.filt$printer <- printer
>
> > RG.filt$genes <- genes
>
> > RG$genes <- RG.filt$genes
>
> > RG$printer <- RG.filt$printer
>
> > MA.norm <- normalizeWithinArrays(RG.filt, method="printtiploess")
>
> > MA.norm <- normalizeBetweenArrays(MA.norm) design <- c(1,-1) i <-
>
> > order(genes$Name) geneList <- data.frame(genes$Name[i]) MA.dup <- NULL
>
> > MA.dup$M <- MA.norm$M[i, 1:2] MA.dup$A <- MA.norm$A[i, 1:2]
>
> > MA.dup$weights <- MA.norm$weights[i, 1:2] fit <- lmFit(MA.dup, design,
>
> > weights=MA.norm$weights[i, 1:2], ndups=3) geneList <-
>
> > uniquegenelist(geneList,ndups=3) eb <- ebayes(fit) x <-
>
> > toptable(number=length(fit$coefficients), genelist=geneList, fit=fit,
>
> > A = fit$Amean, eb=eb, adjust="fdr") write.table(x,
>
> > file="diff_result.txt", sep="\t")
>
> >
>
> > The code above works without any error. The following are some display
>
> > from
>
> > R:
>
> >
>
> > source("C:/project/Ivy/ma.R")
>
> > Read slide151120.gpr
>
> > Read slide151274-2.gpr
>
> > [1] 31
>
> > [1] 31
>
> > slide151120 slide151274-2
>
> > 1 1
>
> > slide151120 slide151274-2
>
> > 1 1
>
> >> x[1:20,]
>
> > genes.Name.i. M t P.Value B
>
> > 1853 sll1393 1.1650 26.748 0.0008727 5.960
>
> > 2401 slr0146 0.3261 28.124 0.0008727 5.505
>
> > 3355 slr1501 4.7499 15.945 0.0099824 4.310
>
> > 2918 slr0889 -0.4134 -13.855 0.0160775 3.517
>
> > 2054 sll1663 0.4032 12.534 0.0161267 3.118
>
> > 1455 sll0800 0.7127 12.443 0.0161267 3.088
>
> > 2142 sll1774 0.8550 11.913 0.0161267 3.044
>
> > 2410 slr0161 0.1676 11.751 0.0161267 2.980
>
> > 968 sll0053 0.3731 11.596 0.0161267 2.976
>
> > 3071 slr1113 1.9828 11.480 0.0161267 2.870
>
> > 3337 slr1469 0.6812 11.633 0.0161267 2.649
>
> > 1337 sll0641 -0.1646 -10.791 0.0205846 2.539
>
> > 2537 slr0350 -0.2023 -10.450 0.0206605 2.443
>
> > 3073 slr1115 0.9906 10.362 0.0206605 2.347
>
> > 4021 ssr1736 0.2674 9.721 0.0240955 2.039
>
> > 3338 slr1470 0.6108 9.941 0.0238879 2.004
>
> > 2483 slr0273 0.2555 9.604 0.0240955 1.980
>
> > 1453 sll0797 0.3010 9.428 0.0240955 1.890
>
> > 3069 slr1109 -0.2019 -9.452 0.0240955 1.867
>
> > 3850 smr0003 0.4657 9.428 0.0240955 1.856
>
> > MA.norm$M[MA.norm$genes["Name"] == "slr1501",]
>
> > slide151120 slide151274-2
>
> > [1,] 4.977 -4.874
>
> > [2,] 4.950 -4.155
>
> > [3,] 4.896 -4.649
>
> > MA.norm$M[MA.norm$genes["Name"] == "slr0146",]
>
> > slide151120 slide151274-2
>
> > [1,] 0.3837 -0.2760
>
> > [2,] 0.3677 -0.2824
>
> > [3,] 0.3395 -0.2802
>
> > MA.norm$M[MA.norm$genes["Name"] == "sll1393",]
>
> > slide151120 slide151274-2
>
> > [1,] 1.114 -1.242
>
> > [2,] 1.143 -1.223
>
> > [3,] 1.118 -1.150
>
> > MA.norm$M[MA.norm$genes["Name"] == "slr1113",]
>
> > slide151120 slide151274-2
>
> > [1,] 2.350 -1.560
>
> > [2,] 2.258 -1.605
>
> > [3,] 2.398 -1.725
>
> >> MA.norm$weights[MA.norm$genes["Name"] == "slr1113",]
>
> > slide151120 slide151274-2
>
> > [1,] 1 1
>
> > [2,] 1 1
>
> > [3,] 1 1
>
> > MA.norm$weights[MA.norm$genes["Name"] == "sll1393",]
>
> > slide151120 slide151274-2
>
> > [1,] 1 1
>
> > [2,] 1 1
>
> > [3,] 1 1
>
> > MA.norm$weights[MA.norm$genes["Name"] == "slr0146",]
>
> > slide151120 slide151274-2
>
> > [1,] 0.1 0.1
>
> > [2,] 0.1 0.1
>
> > [3,] 1.0 0.1
>
> > MA.norm$weights[MA.norm$genes["Name"] == "slr1501",]
>
> > slide151120 slide151274-2
>
> > [1,] 1 1
>
> > [2,] 1 1
>
> > [3,] 1 1
>
> >
>
> > I am confusing about the output from toptable after I fit ebays
>
> > function. I think slr0146 with average M value 0.3261 shouldn't appear
>
> > in top list but it does show up as the second significantly differential
> expressed gene.
>
>
>
> Why do you think that slr0146 should not appear?
>
>
>
> > And
>
> > gene slr1501 should appear as the first and slr1113 should appear in
>
> > top five but it doesn't.
>
>
>
> Again, what do you think this?
>
>
>
> Gordon
>
>
>
> > Could you be kind enough to tell me what's wrong in my code?
>
> >
>
> > Thanks in advance for your time.
>
> >
>
> > Best Regards,
>
> > Hua Weng
>
> >
>
> > Scientific Programmer
>
> > Microarray Core Facility
>
> > Oklahoma State University
>
> > Department of Biochemistry and Molecular Biology
>
> > 246 Noble Research Center
>
> > Stillwater, OK 74078
>
> > hweng at biochem.okstate.edu
>
> > (405) 744-6209
>
> > FAX (405) 744-7799
>
> > http://microarray.okstate.edu/
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
More information about the Bioconductor
mailing list