[BioC] Limma with contrasts

Fri Jan 8 20:38:55 CET 2010

Hi Matthew,

Matthew Willmann wrote:
> Dear Jim,
> 
> Thank you for your help.  I tried your suggestion to 
> use write.table(topTable(fit2, coef=1), "group2-group1.txt") for each 
> set of pairs.  I do get a table in each case with the expected columns, 
> however, the table is only showing the results for 10 genes instead of 
> all of the genes (22810).  I must still be doing something wrong.

Yes, you aren't reading the help page.

See ?topTable, and note the default argument for 'number'.

That said, if you just want to write out all results to a file, then you 
want write.fit(), not topTable().

Best,

Jim

> 
> 
> Matthew
> -----------------------------------------------------
> Matthew R. Willmann, Ph.D.
> Research Associate, Poethig Lab
> University of Pennsylvania
> Department of Biology
> 433 S. University Avenue
> Philadelphia, PA 19104
> Lab phone: 215-898-8916
> Cell: 508-243-2495
> Fax: 215-898-8780
> 
> On Jan 8, 2010, at 1:39 PM, James MacDonald wrote:
> Hi Matthew,
> 
> Matthew Willmann wrote:
>> Dear list,
>> I performed a 9 array Affymetrix microarray experiment with three 
>> genotypes and three replicates of each.  I am trying to do GCRMA 
>> normalization, filtering based on MAS5 calls, and then use limma with 
>> contrasts (and BH to adjust the p-values) to identify differentially 
>> expressed genes.  I have no problems with the normalization and 
>> filtering, but I am having trouble with the limma aspect, specifically 
>> generating the results of the contrasts with BH and writing the 
>> results to a table.  In the past, I have only used limma with 
>> experiments involving two genotypes.  I have pasted my R console 
>> session below. The table I get from the below commands has four 
>> columns, one with the affy IDs and one for each of the three 
>> contrasts, but the only entries are 1,0, and -1.  Any advice is much 
>> appreciated.
>> Thank you.
>> Matthew
>> -----------------------------------------------------
>> Matthew R. Willmann, Ph.D.
>> Research Associate, Poethig Lab
>> University of Pennsylvania
>> Department of Biology
>> 433 S. University Avenue
>> Philadelphia, PA 19104
>> Lab phone: 215-898-8916
>> Cell: 508-243-2495
>> Fax: 215-898-8780
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>> dir()
>> [1] "Grandcentral (3864) WT1 ATH1.CEL"            [2] "Grandcentral 
>> (3865) WT2 ATH1.CEL"            [3] "Grandcentral (3866) WT3 ATH1.CEL" 
>>            [4] "Grandcentral (3867) cct1 ATH1.CEL"           [5] 
>> "Grandcentral (3868) cct2 ATH1.CEL"           [6] "Grandcentral (3869) 
>> cct3 ATH1.CEL"           [7] "Grandcentral (3870) gct1 ATH1.CEL" 
>>           [8] "Grandcentral (3871) gct2 ATH1.CEL"           [9] 
>> "Grandcentral (3872) gct3 ATH1.CEL"           [10] "R Console.txt" 
>>                               [11] "R Console2.txt" 
>>                              [12] "R ConsoleFIRST.txt" 
>>                          [13] "StewartarraysLIMMAresults_01072010.txt" 
>>      [14] "StewartarraysLIMMAresults_01072010.txt.fmt"  [15] 
>> "StewartarraysLIMMAresults_01072010_A.txt"    [16] 
>> "StewartarraysLIMMAresults_01072010_B.txt"    [17] 
>> "StewartarraysLIMMAresults_01072010_C.txt"    [18] 
>> "StewartarraysLIMMAresults_01072010_D.txt"    [19] 
>> "StewartarraysLIMMAresults_01072010_E.txt"    [20] 
>> "StewartarraysLIMMAresults_01072010_E.txt.fmt"
>> [21] "csgtest.txt"                                 [22] 
>> "dataStewartgcrma01072010.txt"                
>>> library(affy)
>> Loading required package: Biobase
>> Loading required package: tools
>> Welcome to Bioconductor
>> Vignettes contain introductory material. To view, type
>> 'openVignette()'. To cite Bioconductor, see
>> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>> Loading required package: affyio
>> Loading required package: preprocessCore
>>> library(gcrma)
>> Loading required package: matchprobes
>> Loading required package: splines
>>> library(limma)
>>> data.Stewart=RedAffy(filenames=Cel.files)
>> Error: could not find function "RedAffy"
>>> data.Stewart=ReadAffy(filenames=Cel.files)
>> Error in AllButCelsForReadAffy(..., filenames = filenames, widget = 
>> widget,  :  object "Cel.files" not found
>>> Cel.files=list.files(pattern=".CEL")
>>> Cel.files
>> [1] "Grandcentral (3864) WT1 ATH1.CEL"  "Grandcentral (3865) WT2 
>> ATH1.CEL" [3] "Grandcentral (3866) WT3 ATH1.CEL"  "Grandcentral (3867) 
>> cct1 ATH1.CEL"
>> [5] "Grandcentral (3868) cct2 ATH1.CEL" "Grandcentral (3869) cct3 
>> ATH1.CEL"
>> [7] "Grandcentral (3870) gct1 ATH1.CEL" "Grandcentral (3871) gct2 
>> ATH1.CEL"
>> [9] "Grandcentral (3872) gct3 ATH1.CEL"
>>> data.Stewart=ReadAffy(filenames=Cel.files)
>>> eset<-gcrma(data,fast=F)
>> Error in function (classes, fdef, mtable)  :  unable to find an 
>> inherited method for function "indexProbes", for signature "function", 
>> "character"
>>> eset<-gcrma(data.Stewart,fast=F)
>> Adjusting for optical effect.........Done.
>> Computing affinities.Done.
>> Adjusting for non-specific binding.........Done.
>> Normalizing
>> Calculating Expression
>>> data.pma<-mas5calls(data.Stewart)
>> Getting probe level data...
>> Computing p-values
>> Making P/M/A Calls
>>> data.pma.exprs=exprs(data.pma)
>>> index.9arrays=grep(".CEL",colnames(data.pma.exprs))
>>> numP=apply(data.pma.exprs[,index.9arrays]=="P",1,sum)
>>> gene.select=which(numP!=0)
>>> length(gene.select)
>> [1] 17920
>>> data.wk=eset[gene.select,]
>>> write.table(data.wk,"csgtest08.txt",sep="\t")
> 
> I can't imagine that worked well - unless write.table() has been 
> extended, it should be expecting a data.frame or matrix, not an 
> ExpressionSet.
> 
>>> design <- model.matrix(~ 0+factor(c(1,1,1,2,2,2,3,3,3)))
>>> colnames(design) <- c("group1", "group2", "group3")
>>> fit <- lmFit(eset, design)
>>> contrast.matrix <- makeContrasts(group2-group1, group3-group2, 
>>> group3-group1, levels=design)
>>> fit2 <- contrasts.fit(fit, contrast.matrix)
>>> fit2 <- eBayes(fit2)
>>> results <- decideTests(fit2, method="separate")
>>> objects()
>> [1] "Cel.files"             "affinity.spline.coefs" "contrast.matrix" 
>>      [4] "data.Stewart"          "data.pma" 
>>              "data.pma.exprs"       [7] "data.wk" 
>>               "design"                "eset"                 [10] 
>> "fit"                   "fit2"                 "gene.select" 
>>          [13] "index.9arrays"         "numP" 
>>                  "results"              
>>> write.table(results,"StewartarraysLIMMAresults_01082010.txt",sep="\t",col.names=NA)
> 
> The results object should just be a matrix containing (1, 0, -1). It can 
> be used to make a Venn Diagram, but without further code isn't that 
> useful by itself.
> 
> <self-promotion>
> 
> Above you mention that you want output for each comparison. There are 
> several ways to do that, but IMO, the easiest way is to use either 
> limma2annaffy() or vennSelect() from the affycoretools package. The 
> first function will output either HTML or text (or both) tables with all 
> genes that fulfill p-value (or FDR levels in your case) and/or fold 
> change criteria. All you need to do is feed in some of the objects you 
> have already created.
> 
> If you are interested in the output from a Venn Diagram (try the 
> following first)
> 
> vennDiagram(results)
> 
> where you will get 7 tables containing the genes from each part of the 
> Venn Diagram, then you would use vennSelect().
> 
> </self-promotion>
> 
> Alternatively, since you used the "separate" argument to decideTests(), 
> you could just use
> 
> write.table(topTable(fit2, coef=1), "group2-group1.txt")
> 
> to get e.g., the first comparison, switching the coef (and possibly the 
> p-value and lfc arguments as you wish) to just get the output from limma.
> 
> Best,
> 
> Jim
> 
> 
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch <mailto:Bioconductor at stat.math.ethz.ch>
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the 
>> archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> Hildebrandt Lab
> 8220D MSRB III
> 1150 W. Medical Center Drive
> Ann Arbor MI 48109-5646
> 734-936-8662
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should not 
> be used for urgent or sensitive issues 

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues