[BioC] DEXSeq all p-values are 1
Ryan C. Thompson
rct at thompsonclan.org
Fri Dec 14 06:50:28 CET 2012
With only two replicates for each condition, you will not have a huge
amount of power to detect differential expression. Combine that with the
fact that you are looking at exons and not whole genes, so your counts
will be much lower, and you will have a difficult time achieving
statistical significance.
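
To put rough numbers on that, here is a minimal sketch. It is not what DEXSeq computes, only a back-of-the-envelope normal approximation. It assumes negative binomial counts with variance mu + dispersion * mu^2 and uses the delta method for the variance of a log group mean. The function name, the counts and the dispersion values are all hypothetical, chosen only for illustration.

```python
import math

def approx_log_fc_pvalue(mean_a, mean_b, dispersion, n_reps):
    """Two-sided p-value for log(mean_b / mean_a) from a normal approximation.

    Assumes negative binomial counts with Var = mu + dispersion * mu**2 and
    uses the delta method: Var(log of a group mean) ~ (1/mu + dispersion) / n.
    This is an illustration only, not the test that DEXSeq performs.
    """
    log_fc = math.log(mean_b / mean_a)
    var = (1.0 / mean_a + dispersion) / n_reps + (1.0 / mean_b + dispersion) / n_reps
    z = abs(log_fc) / math.sqrt(var)
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(z / math.sqrt(2.0))

# Same 4-fold change and two replicates per condition in both cases.
# Counts and dispersions below are made-up values, not taken from real data.
p_exon = approx_log_fc_pvalue(10, 40, dispersion=0.5, n_reps=2)
p_gene = approx_log_fc_pvalue(1000, 4000, dispersion=0.05, n_reps=2)
print(f"exon-like counts: p ~ {p_exon:.3g}")
print(f"gene-like counts: p ~ {p_gene:.3g}")
```

Under these assumptions the 4-fold change at exon-like counts does not reach p < 0.05 even before any multiple-testing adjustment, while the same fold change at gene-like counts is clearly significant. Adjusting over many exons can only push the exon-level p-values higher.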
-Ryan
On 12/13/2012 08:59 AM, Philip Jonsson wrote:
> I realize that it can be hard to nail down a source of this
> (possible) error.
>
> So, the data is four single-end runs - two treated and two control-treated
> samples. The design I set up looks like this:
>                        condition  replicate
> ./treatedsample1.txt   Treatment  1
> ./treatedsample2.txt   Treatment  2
> ./controlsample1.txt   Vehicle    1
> ./controlsample2.txt   Vehicle    2
>
> After using read.HTSeqCounts I use estimateSizeFactors, followed
> by estimateDispersions. For estimateDispersions I get the same results
> whether I set minCount to 0 or 10 and whether I set maxExon to default 70
> or something higher, like 1000. I have not tried to change initialGuess,
> since I don't really grasp what it means. I also leave formula unchanged.
> Afterwards I use fitDispersionFunction and testForDEU with default
> parameters.
>
> My ExonCountSet was created with the package's Python scripts. I used a GFF
> file from UCSC which is of the same genome build as used for the read
> mapping.
>
> There is variance across my replicates, but it doesn't seem too extreme. I
> could, as mentioned, call differentially expressed genes with DESeq without
> problems. Do the values for dispBeforeSharing, dispFitCoefs, or dispFitted
> tell me anything about this?
>
> On 13 December 2012 10:14, Steve Lianoglou
> <mailinglist.honeypot at gmail.com> wrote:
>
>> Hi,
>>
>> It's hard to provide any meaningful help, since we really don't have
>> any information about your data, or what your code looks like (code
>> examples), to identify a problem.
>>
>> But:
>>
>> On Thu, Dec 13, 2012 at 10:08 AM, Philip [guest] <guest at bioconductor.org>
>> wrote:
>>> Hello,
>>>
>>> I'm trying to use DEXSeq to identify alternative exon usage. Using DESeq
>>> I've identified ~200 differentially expressed genes in my gene set. I've
>>> basically applied the guidelines from the manual to my data set - single
>>> reads in duplicates +/- treatment.
>>
>> Does that mean you have 4 single-end read runs? 2 biological
>> replicates for (+) treatment, and 2 for (-) treatment?
>>
>>> I've played around with the parameters in different ways,
>> Which parameters?
>> What ways?
>> How did you count reads / bin?
>> How did you define the bins?
>>
>>> but no matter how I do it the adjusted p-values all come out as 1 or
>>> N/A. The non-adjusted p-values are pretty high, so I reckon the adjusted
>>> p-values are "true"; however, when I go through single genes I find exons
>>> that have really high fold-change values indicating differential expression.
>>
>> High dispersion values can make high fold changes statistically
>> insignificant.
>>
>> Did you explore the quality of your replicates? How? How do they look?
>>
>>> Is this a result one can expect (due to e.g. high variance in replicates)
>> That can explain it.
>>
>> There is, of course, always the possibility that there is very little
>> differential splicing in your experiment.
>>
>>> or is it possible that something is wrong in my analysis?
>> It is always possible that there is something wrong with an analysis.
>> As I mentioned at the start, without knowing more about your analysis
>> and seeing the code, there is no way that anybody can answer this
>> question.
>>
>> HTH,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor