[BioC] DESeq2 transcript level

Yuan Hao yuan.x.hao at gmail.com
Fri May 9 20:43:37 CEST 2014


Hi Alicia, 

Here are my two cents. I think RSEM isoform counts include ‘multi-isoform’ reads, i.e. reads that map to multiple isoforms of a single gene, so I don’t think DESeq2 would show its full power in terms of accuracy in your case. If your purpose is only to obtain an ‘inspiration of biology’ from your data, why not simply use the fold change of your isoform expression and check which isoforms change the most? Yes, you would probably need some kind of normalization to account for differences in library size. 
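
As a rough illustration of that suggestion, here is a minimal Python sketch (it is not taken from any package mentioned in this thread). It assumes a small in-memory table of isoform counts; the sample names, isoform IDs, counts, the pseudocount of 1 and the three helper functions are all made up for illustration. Each sample is scaled to counts per million to account for library size, and isoforms are then ranked by absolute log2 fold change between two conditions.

```python
import math

def counts_per_million(counts):
    """Scale each sample's counts by its library size (sum of its counts)."""
    cpm = {}
    for sample, iso_counts in counts.items():
        lib_size = sum(iso_counts.values())
        cpm[sample] = {iso: c * 1e6 / lib_size for iso, c in iso_counts.items()}
    return cpm

def log2_fold_changes(cpm, group_a, group_b, pseudocount=1.0):
    """Per-isoform log2(mean CPM in group_b / mean CPM in group_a).

    The pseudocount (an arbitrary choice here) avoids division by zero
    for isoforms with no reads in one of the groups.
    """
    lfc = {}
    for iso in cpm[group_a[0]]:
        mean_a = sum(cpm[s][iso] for s in group_a) / len(group_a)
        mean_b = sum(cpm[s][iso] for s in group_b) / len(group_b)
        lfc[iso] = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
    return lfc

def rank_by_abs_change(lfc):
    """Sort isoforms from most to least changed, ignoring direction."""
    return sorted(lfc.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical toy data: two life-cycle stages, two replicates each.
# Estimated counts (e.g. from RSEM) may be non-integer; that is fine here.
counts = {
    "stage1_rep1": {"iso_A": 100, "iso_B": 50, "iso_C": 850},
    "stage1_rep2": {"iso_A": 220, "iso_B": 90, "iso_C": 1690},
    "stage2_rep1": {"iso_A": 400, "iso_B": 50, "iso_C": 550},
    "stage2_rep2": {"iso_A": 820, "iso_B": 95, "iso_C": 1085},
}

cpm = counts_per_million(counts)
lfc = log2_fold_changes(
    cpm,
    group_a=["stage1_rep1", "stage1_rep2"],
    group_b=["stage2_rep1", "stage2_rep2"],
)
ranked = rank_by_abs_change(lfc)

for iso, value in ranked:
    print(f"{iso}\tlog2FC={value:+.2f}")
```

This only produces a ranking. It says nothing about statistical significance, and it inherits whatever uncertainty is in the estimated isoform counts.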

Cheers,
Yuan


On May 9, 2014, at 2:08 PM, Alicia R. Pérez-Porro <alicia.r.perezporro at gmail.com> wrote:

> Hi everyone,
> 
> Yes, sorry Mike, I forgot to mention that I estimated my isoform counts
> with RSEM.
> 
> Thomas, I am aware of the pipeline inside Trinity that uses RSEM+(edgeR or
> DESeq) for DE analysis. I tried it and didn't like it, simply because it is
> not a good option for my data. I have non-standard data and the pipeline
> is too hands-off, so it works nicely only if you have more standard data.
> 
> Mainly, what I want is to do pairwise comparisons between my 5 different
> conditions (5 different moments along the life cycle of my animal) to
> generate a list of the most differentially expressed genes (DEGs), BLAST
> them to see what they are, potentially get the GO terms associated with
> them, and have an overview of what is happening along the life cycle. In
> the future I will probably continue with an analysis of differential exon
> expression (DEXSeq) and/or differential isoform expression. I am also
> considering a time-course differential expression analysis. But, like I
> said, that is for the future; right now I am interested just in an
> overview of what is going on in the life cycle of my animal. That's why I
> was thinking about treating the isoforms as genes.
> 
> Does that make sense to you, guys?
> 
> Thanks again,
> Alicia
> 
> 
> 
> --
> Alicia R. Pérez-Porro
> PhD candidate
> 
> Giribet lab
> Department of Organismic and Evolutionary Biology
> MCZ labs
> Harvard University
> 26 Oxford St, Cambridge MA 02138
> phone: +1 617-496-5308
> fax: +1 617-495-5667
> www.oeb.harvard.edu/faculty/giribet/
> 
> Department of Marine Ecology
> Center for Advanced Studies of Blanes (CEAB-CSIC)
> C/Accés Cala St. Francesc 14
> 17300 Blanes, Girona, SPAIN
> phone: +34 972 336 101
> fax: +34 972 337 806
> www.ceab.csic.es
> 
> 
> On Fri, May 9, 2014 at 1:45 PM, Michael Love <michaelisaiahlove at gmail.com> wrote:
> 
>> hi Alicia,
>> 
>> In your previous email, you didn't mention that you had used software
>> for estimating isoform specific counts, so I pasted a link to Simon's
>> answer concerning isoform counts. If one can assign reads to isoforms
>> with high confidence then it would be fine to do testing with DESeq2.
>> However, this is typically a difficult task which involves uncertainty
>> in estimation and DESeq2 does not take into account any uncertainty in
>> these estimated counts. If you have 10 reads in the count matrix,
>> DESeq2 will take this to mean 10 reads were unambiguously assigned to
>> this feature. Through estimation steps, this could be 10 reads plus or
>> minus 10 reads. So the quality of the results will depend on the
>> quality of the input.
>> 
>> Mike
>> 
>> On Fri, May 9, 2014 at 11:11 AM, Alicia R. Pérez-Porro
>> <alicia.r.perezporro at gmail.com> wrote:
>>> Hi Mike,
>>> 
>>> Thanks for your answer. I read the explanation from Simon and am still a
>>> little bit confused. Reading the edgeR manual I found this: "edgeR can be
>>> applied to differential expression at the gene, exon, transcript or tag
>>> level. In fact, read counts can be summarized by any genomic feature.
>>> edgeR analyses at the exon level are easily extended to detect differential
>>> splicing or isoform-specific differential expression".
>>> 
>>> So, my initial idea was to treat each isoform like a gene because, for the
>>> approach that I am attempting right now, I am not interested in differences
>>> in expression of isoforms belonging to the same gene. I did my de novo
>>> assembly with Trinity, alignment using Bowtie and estimation of abundance
>>> with RSEM. I created a matrix with the RSEM counts (isoforms results) and
>>> I was planning on using it as input for my next step.
>>> 
>>> I understand that even if I want to treat each isoform as a gene, doing it
>>> with DESeq or DESeq2 is not a good idea. I really don't know about edgeR,
>>> and am also considering EBSeq.
>>> 
>>> Suggestions? Thoughts?
>>> 
>>> Thank you in advance.
>>> Alicia
>>> 
>>> 
>>> On Fri, May 9, 2014 at 8:07 AM, Michael Love <michaelisaiahlove at gmail.com>
>>> wrote:
>>>> 
>>>> hi Alicia,
>>>> 
>>>> DESeq2 is intended for gene-level analysis. Here is some explanation from
>>>> Simon on the list as to why doing a count-based analysis at the
>>>> transcript/isoform level is problematic:
>>>> 
>>>> https://stat.ethz.ch/pipermail/bioconductor/2012-February/043410.html
>>>> 
>>>> if you are interested in looking for differential exon usage, you can
>>>> instead consider using the DEXSeq package:
>>>> 
>>>> publication:
>>>> http://www.ncbi.nlm.nih.gov/pubmed/22722343
>>>> 
>>>> software:
>>>> http://bioconductor.org/packages/release/bioc/html/DEXSeq.html
>>>> 
>>>> Mike
>>>> 
>>>> 
>>>> 
>>>> On Thu, May 8, 2014 at 7:26 PM, Alicia R. Pérez-Porro
>>>> <alicia.r.perezporro at gmail.com> wrote:
>>>>> 
>>>>> Dear Bioconductor users,
>>>>> 
>>>>> I am working with differential expression at the transcript (isoform)
>>>>> level. I have 6 different conditions (2 replicates/condition).
>>>>> 
>>>>> I want to know if I can use DESeq2 for that or if the package can only be
>>>>> used at the gene level.
>>>>> 
>>>>> Thanks,
>>>>> Alicia
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> 
>>>> 
>>> 
>> 
> 


