[BioC] DESeq2, problems using list argument to extract results from DESeq
Jon Bråte
jon.brate at ibv.uio.no
Wed Jul 30 20:23:23 CEST 2014
Simon Anders <anders at ...> writes:
>
> Hi Jon
>
> On 11/07/14 10:36, Jon Bråte [guest] wrote:
> > I have gene expression from several embryogenesis stages as well as
non-reproductive stages. I have done
> DESeq on my DESeqDataSet and want to extract the results from all the
embryogenesis stages vs non-reproductive.
> >
> > Heres the error I get:
> >
> >> resNonReprVsAll = results(dds, contrast=list(c("Early vitellogenesis",
"Late vitellogenesis",
> "Fertilization", "Early cleavage", "Late cleavage", "Early preinversion",
"Late preinversion",
> "Early postinversion", "Late postinversion"), "Non reproductive"))
> > Error in cleanContrast(object, contrast, expanded = isExpanded,
listValues = listValues) :
> > all elements of the contrast as a list of length 2 should be elements
of 'resultsNames(object)'
>
> The 'results' function gives you a results table for _one_ contrast. If
> you want several contrasts, you have to call 'results' several times.
>
> Also note that 'results' now supports two ways of specifying contrasts:
> For main effects, we suggest that you use the new format of passing a
> vector with three strings, namely the factor which you want to contrast
> and then the two levels you want to compare, e.g.,
>
> results( dds, contrast = c( "condition",
> "Fertilization", "Non reproductive" ) )
> # (or perhaps "Non.reproductive", depending how your level is
> # actually names)
>
> The old way of specifying _two_ elements from 'resultsNames' needs to be
> used only for more complicated contrasts, e.g., those involving
> interactions.
>
> If this is unclear, ask again, and explain the biology of your
> experiment and what contrasts you are actually interested in.
>
> Simon
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at ...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
Dear Simon,
thank you for your answer. What I am interested in is to find genes that
possibly play a part in embryogenesis. In other words, genes that are
up-regulated in any of the embryogenesis stages compared to the
non-reproductive stages.
I have tried three different approaches, but struggle to understand what
they mean, how they differ, and if they actually make sense. I try to
explain what I did, sorry if it is difficult to read:
First I tried to contrast all the embryogenesis stages compared to the non
reproductive using the Wald test in DESeq:
dds = DESeq(dds)
resCtrst = results(dds, contrast=list(c("conditionEarly.vitellogenesis",
"conditionLate.vitellogenesis", "conditionFertilization",
"conditionEarly.cleavage",
"conditionLate.cleavage",
"conditionEarly.preinversion", "conditionLate.preinversion",
"conditionEarly.postinversion",
"conditionLate.postinversion"), "conditionNon.reproductive"))
resCtrst = resCtrst[order(resCtrst$padj),]
resSigCtrst = resCtrst[which(resCtrst$padj < 0.1), ]
resSigCtrstPositive = resSigCtrst[which(resSigCtrst$log2FoldChange > 0), ]
write.csv2(resSigCtrstPositive,
file="DESeq2_Wald_test_Sig_upreg_all_cond_vs_non_repr.csv")
This gave me 1736 upregulated genes.
Second I ran an LTR test like this:
ddsLRT = DESeq(dds, test="LRT", full=~condition, reduced=~1)
resLRT = results(ddsLRT)
resLRT = resLRT[order(resLRT$padj),]
resSigLRT = resLRT[which(resLRT$padj < 0.1), ]
resSigLRTPositive = resSigLRT[which(resSigLRT$log2FoldChange > 0), ]
write.csv2(resSigLRT, file="DESeq2_Sig_LRT_upreg.csv")
This gave me 3158 upregulated genes (all genes have positive log2fold
changes so I guess I cannot see which are up or down regulated?)
And I don't understand exactly what is contained in resLRT, are the
significant genes here significantly expressed between Non reproductive
samples and all others compared to the null hypothesis? Or is it just for
the two conditions Late.postinversion vs Non reproductive?:
mcols(resLRT)
DataFrame with 6 rows and 2 columns
type
description
<character>
<character>
1 intermediate the base mean over all
rows
2 results log2 fold change: condition Late.postinversion vs
Non.reproductive
3 results standard error: condition Late.postinversion vs
Non.reproductive
4 results LRT statistic: '~ condition' vs
'~ 1'
5 results LRT p-value: '~ condition' vs
'~ 1'
6 results BH adjusted
p-values
Third I tried to compare the Wald test and the LTR:
res = results(dds) #should I use resCtrst instead of res??
tab = table(Wald=res$padj < .1, LRT=resLRT$padj < .1)
addmargins(tab)
LRT
Wald FALSE TRUE Sum
FALSE 1027 1766 2793
TRUE 526 747 1273
Sum 1553 2513 4066
sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
base
other attached packages:
[1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2
GenomicRanges_1.16.3
[5] GenomeInfoDb_1.0.2 IRanges_1.22.9 BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.26.0 Biobase_2.24.0 DBI_0.2-7
RColorBrewer_1.0-5 RSQLite_0.11.4
[6] XML_3.98-1.1 XVector_0.4.0 annotate_1.42.0
genefilter_1.46.1 geneplotter_1.42.0
[11] grid_3.1.0 lattice_0.20-29 locfit_1.5-9.1
splines_3.1.0 stats4_3.1.0
[16] survival_2.37-7 tools_3.1.0 xtable_1.7-3
More information about the Bioconductor
mailing list