[BioC] Limma to find differentially expressed genes
James W. MacDonald
jmacdon at uw.edu
Fri Apr 19 15:04:05 CEST 2013
Hi Sandy,
On 4/19/2013 8:03 AM, Sandy [guest] wrote:
>
> I have my matrix designed in the following way which I name as mat1
>
> Probes sample1 sample1 sample2 sample2 sample3 sample3 sample4 sample4
> rep1 rep2 rep1 rep2 rep1 rep2 rep1 rep2
> ------------------------------------------------------------------------
> gene1 5.098 5.076 5.072 4.677 7.450 7.456 8.564 8.555
> gene2 8.906 8.903 6.700 6.653 6.749 6.754 7.546 7.540
> gene3 7.409 7.398 5.392 5.432 6.715 6.724 5.345 5.330
> gene4 4.876 4.869 5.864 5.981 4.280 4.290 4.267 4.255
> gene4 3.567 3.560 3.554 3.425 8.500 8.564 6.345 6.330
> gene5 2.569 2.560 8.600 8.645 5.225 5.234 7.345 7.333
>
> I use the limma package to find the DEG's
>
> Group<- factor(c("p1", "p1", "p2", "p2","p3", "p4","p4")
> design<- model.matrix(~0 + Group)
> colnames(design)<- gsub("Group","", colnames(design))
> fit<- lmFit(mat1[,1:4],design)
> contrast.matrix<-makeContrasts(p1-p2,levels=design)
> fit2<-contrasts.fit(fit,contrast.matrix)
> fit2<-eBayes(fit2)
> sel.diif<-p.adjust(fit2$F.p.value,method="fdr")<0.05
> deg<-mat1[,1:4][sel.diif,]
>
> So will "deg" just give me those genes which are significant in sample one versus two. I am interested in those genes which are significant only in first sample but not in the second sample and am not sure if this is the right approach.
Two things. First, you shouldn't be digging around in the 'fit2' object
if you don't know what you are doing. If you read the user's guide, you
will see that the correct sequence of events for you would be (after the
eBayes step):
topGenes <- topTable(fit2, coef = 1, p.value = 0.05, number = Inf)
Then if you want the underlying data
deg <- mat[row.names(topGenes),1:4]
Second, I think you might misunderstand a fundamental concept here.
There is no such thing as genes being significant in one sample but not
in the second. A microarray can't really tell you if a gene is expressed
or not. All it can tell you is if a gene appears to be expressed at a
significantly different level in one sample versus another. That is what
you are testing here, whether or not genes in p1 are expressed at a
different level than in p2.
So this raises a question; when you say 'significant only in first
sample but not in the second sample' are you instead asking for genes
that are UP in p1 as compared to p2? If so, you can further filter
results in the topGenes object by the logFC column, selecting only those
with a positive value.
Best,
Jim
>
>
> -- output of sessionInfo():
>
> R version 2.15.2 (2012-10-26)
> Platform: i686-redhat-linux-gnu (32-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] limma_3.14.4 csSAM_1.2.1 GOstats_2.24.0 RSQLite_0.10.0 DBI_0.2-5
> [6] graph_1.36.2 Category_2.22.0 AnnotationDbi_1.20.5 affy_1.36.1 Biobase_2.16.0
> [11] BiocGenerics_0.4.0 R.utils_1.23.2 R.oo_1.13.0 R.methodsS3_1.4.2
>
> loaded via a namespace (and not attached):
> [1] affyio_1.22.0 annotate_1.36.0 AnnotationForge_1.0.3 BiocInstaller_1.8.3 genefilter_1.40.0
> [6] GO.db_2.8.0 GSEABase_1.18.0 IRanges_1.16.6 parallel_2.15.2 preprocessCore_1.18.0
> [11] RBGL_1.34.0 splines_2.15.2 stats4_2.15.2 survival_2.36-14 tools_2.15.2
> [16] XML_3.9-4 xtable_1.6-0 zlibbioc_1.4.0
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
More information about the Bioconductor
mailing list