[Bioc-devel] R: Re: Re: Deseq2 and differentia expression
jarod_v6 at libero.it
jarod_v6 at libero.it
Fri Jul 11 16:55:06 CEST 2014
Dear All I don't understand. I want to extract only genes have a fold-change
more than 0.5
I use this comand but I have all the genes inside:
resGA2 <- results(dds, lfcThreshold=.5, altHypothesis="greaterAbs") #greater
dim(resGA)
>> [1] 62893 6
>>> dim(resGA2)
>> [1] 62893 6
>>> dim(resGA3)
>> [1] 62893 6
>>
how can extract the names of genes are in resGA2?
thanks for the patience!
j.
>----Messaggio originale----
>Da: michaelisaiahlove at gmail.com
>Data: 11/07/2014 15.15
>A: "jarod_v6 at libero.it"<jarod_v6 at libero.it>
>Cc: "bioc-devel at r-project.org"<bioc-devel at r-project.org>
>Ogg: Re: Re: [Bioc-devel] Deseq2 and differentia expression
>
>hi Jarod,
>
>This is more of a main Bioc mailing list question, so you can address
>future questions there.
>
>On Fri, Jul 11, 2014 at 6:05 AM, jarod_v6 at libero.it <jarod_v6 at libero.it>
wrote:
>> Dear Dr,
>> Thanks so much for clarification!!!
>> So I try the test of log fold change but I'm bit confusion on the results:
>> If I interested in the genes that have a foldchange more than 0.5 and 2 I
need
>> to use this comand is it right?
>
>the second and third results() commands below give you this.
>
>> ddsNoPrior <- DESeq(ddHTSeq, betaPrior=FALSE) #only for lessABs
>>
>> resGA <- results(ddsNoPrior, lfcThreshold=.5, altHypothesis="lessAbs")
>> #greater tdi
>> resGA2 <- results(dds, lfcThreshold=.5, altHypothesis="greaterAbs")
#greater
>> tdi
>> resGA3 <- results(dds, lfcThreshold=2, altHypothesis="greaterAbs") #greater
>> tdi
>>
>> dim(resGA)
>> [1] 62893 6
>>> dim(resGA2)
>> [1] 62893 6
>>> dim(resGA3)
>> [1] 62893 6
>>
>> The number of gene select it is always the same.. Where is my mistake!
>> thanks in advance!
>>
>
>DESeq2 returns the results for all the genes in the same order as the
>original object. You need to specify a threshold on adjusted p-value.
>
>table(res$padj < 0.1)
>
>You can use subset(res, padj < 0.1) to filter the DataFrame.
>
>>
>>>----Messaggio originale----
>>>Da: michaelisaiahlove at gmail.com
>>>Data: 10/07/2014 14.46
>>>A: "jarod_v6 at libero.it"<jarod_v6 at libero.it>
>>>Cc: "bioc-devel at r-project.org"<bioc-devel at r-project.org>
>>>Ogg: Re: [Bioc-devel] Deseq2 and differentia expression
>>>
>>>hi Jarod,
>>>
>>>On Thu, Jul 10, 2014 at 7:59 AM, jarod_v6 at libero.it <jarod_v6 at libero.it>
>> wrote:
>>>> Hi there!!!
>>>>
>>>> I have did this code:
>>>> SampleTable <-data.frame(SampleName=metadata$ID_CLINICO,
>> fileName=metadata$NOME,
>>>> condition=metadata$CONDITION,prim=metadata$CDT)
>>>> ddHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable=SampleTable,directory="
>>>> Count/", design= ~condition) # effetto dello mutazione
>>>> ddHTSeq$condition <- relevel(ddHTSeq$condition, "NVI")# quindi verso non
>>>> viscerali
>>>> dds <- DESeq(ddHTSeq)
>>>> res <-results(dds)
>>>>
>>>> resOrdered <- res[order(res$padj),]
>>>> head(resOrdered)
>>>> ResSig <- res[ which(res$padj < 0.1 ), ]
>>>>
>>>>
>>>> I want to select some data. How can I do? which is the good cut-off on
FDR
>>>> values?
>>>
>>>The code above does the selection on adjusted p-value. The right FDR
>>>cutoff is up to you, what percent of false discoveries is tolerable in
>>>the final list of genes? The considerations are: the cost of
>>>validation or following up on a false discovery, versus the cost of a
>>>missed discovery. These are hard to quantify even if you know all the
>>>details of an experiment.
>>>
>>>> All the data have a FDR less thank 0.1 . :
>>>> Is it right this comand?
>>>> res[ which(res$padj < 0.1 ), ]
>>>>
>>>
>>>yes. The which() is necessary because some of the res$padj have NA. If
>>>you have a logical vector with NA, you cannot directly index a
>>>DataFrame, but you can index after calling which(), which will return
>>>the numeric index of the TRUE's. You could also subset with:
>>>subset(res, padj < 0.1).
>>>
>>>The reason for the NAs is explained in the vignette: "Note that some
>>>values in the results table can be set to NA, for either one of the
>>>following reasons:..."
>>>
>>>
>>>> How many significant genes are with FDR less than 0.1 and have an
absolute
>>>> value of foldchange more of 1 ? I have and error on this. I have many
NA
>>>> values.
>>>>
>>>> If I try this code I have the follow errors
>>>>> significant.genes = res[(res$padj < .05 & abs(res$log2FoldChange) >= 1
),]
>> #
>>>> Set thethreshold for the log2 fold change.
>>>> Error in normalizeSingleBracketSubscript(i, x, byrow = TRUE, exact =
FALSE)
>> :
>>>> subscript contains NAs
>>>>
>>>
>>>This is not the recommended way to filter on large log fold changes.
>>>We have implemented a test specifically for this, check the vignette
>>>section on "Tests of log2 fold change above or below a threshold"
>>>
>>>Mike
>>>
>>>> How can I resolve this problenms?
>>>> thanks in advance for the help
>>>>
>>>>
>>>>
>>>> R version 3.1.0 (2014-04-10)
>>>> Platform: i686-pc-linux-gnu (32-bit)
>>>>
>>>> locale:
>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] splines parallel stats graphics grDevices utils datasets
>>>> [8] methods base
>>>>
>>>> other attached packages:
>>>> [1] annotate_1.40.1 RColorBrewer_1.0-5 gplots_2.14.1
>>>> [4] org.Hs.eg.db_2.10.1 ReportingTools_2.4.0 AnnotationDbi_1.24.0
>>>> [7] RSQLite_0.11.4 DBI_0.2-7 knitr_1.6
>>>> [10] biomaRt_2.18.0 DESeq2_1.4.5 RcppArmadillo_0.
>> 4.320.0
>>>> [13] Rcpp_0.11.2 GenomicRanges_1.14.4 XVector_0.2.0
>>>> [16] IRanges_1.20.7 affy_1.40.0 NOISeq_2.6.0
>>>> [19] Biobase_2.22.0 BiocGenerics_0.8.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] affyio_1.30.0 AnnotationForge_1.4.4 BiocInstaller_1.
>>>> 12.1
>>>> [4] Biostrings_2.30.1 biovizBase_1.10.8 bitops_1.0-
>>>> 6
>>>> [7] BSgenome_1.30.0 Category_2.28.0 caTools_1.
>>>> 17
>>>> [10] cluster_1.15.2 colorspace_1.2-4 dichromat_2.0-
>>>> 0
>>>> [13] digest_0.6.4 edgeR_3.4.2 evaluate_0.
>>>> 5.5
>>>> [16] formatR_0.10 Formula_1.1-1 gdata_2.
>>>> 13.3
>>>> [19] genefilter_1.44.0 geneplotter_1.40.0 GenomicFeatures_1.
>>>> 14.5
>>>> [22] ggbio_1.10.16 ggplot2_1.0.0 GO.db_2.
>>>> 10.1
>>>> [25] GOstats_2.28.0 graph_1.40.1 grid_3.
>>>> 1.0
>>>> [28] gridExtra_0.9.1 GSEABase_1.24.0 gtable_0.
>>>> 1.2
>>>> [31] gtools_3.4.1 Hmisc_3.14-4 hwriter_1.
>>>> 3
>>>> [34] KernSmooth_2.23-12 lattice_0.20-29 latticeExtra_0.6-
>>>> 26
>>>> [37] limma_3.18.13 locfit_1.5-9.1 MASS_7.3-
>>>> 33
>>>> [40] Matrix_1.1-4 munsell_0.4.2 PFAM.db_2.
>>>> 10.1
>>>> [43] plyr_1.8.1 preprocessCore_1.24.0 proto_0.3-
>>>> 10
>>>> [46] RBGL_1.38.0 RCurl_1.95-4.1 reshape2_1.
>>>> 4
>>>> [49] R.methodsS3_1.6.1 R.oo_1.18.0 Rsamtools_1.
>>>> 14.3
>>>> [52] rtracklayer_1.22.7 R.utils_1.32.4 scales_0.
>>>> 2.4
>>>> [55] stats4_3.1.0 stringr_0.6.2 survival_2.37-
>>>> 7
>>>> [58] tools_3.1.0 VariantAnnotation_1.8.13 XML_3.98-
>>>> 1.1
>>>> [61] xtable_1.7-3 zlibbioc_1.8.0
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>
>>
>
More information about the Bioc-devel
mailing list