[Bioc-sig-seq] counts differences among multiple RNA-seq samples

Kunbin Qu KQu at genomichealth.com
Sat Apr 17 03:05:21 CEST 2010


Nico and Kasper, 

Thanks a lot for your input and advice. The problem I am trying to solve is more than just counting, although that is the first step. One issue I need to deal with is how and when to merge the islands (regions separated by zero coverage in IRanges), since if I just count the islands, there are too many. Some publications mention a hard cutoff (15 bp, 30 bp or something like that), but with no real biological basis. Would a known gene structure model (HMM etc.) help? Let's talk offline, Nico. I'd love to hear more from you. I am at AACR now, sorry for the delay.
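
To make the island-merging question concrete, here is a minimal sketch of the hard-cutoff approach, written in plain Python rather than with IRanges. The function name merge_islands, the max_gap parameter and the example coordinates are all made up for illustration; intervals are assumed to be 1-based and inclusive on a single chromosome and strand. In IRanges itself, reduce() with its min.gapwidth argument should do something equivalent, but please check the documentation for the exact semantics.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end), 1-based and inclusive.
Interval = Tuple[int, int]


def merge_islands(islands: List[Interval], max_gap: int) -> List[Interval]:
    """Merge islands separated by at most max_gap bp of zero coverage."""
    merged: List[Interval] = []
    for start, end in sorted(islands):
        # Number of uncovered bases between the previous island and this one.
        if merged and start - merged[-1][1] - 1 <= max_gap:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Made-up islands with gaps of 9 bp, 59 bp and 9 bp between them.
islands = [(100, 150), (160, 200), (260, 300), (310, 320)]

# The number of islands left depends entirely on the chosen cutoff.
for gap in (0, 15, 30, 60):
    print(gap, merge_islands(islands, gap))
```

Running it with several cutoffs shows how strongly the island count depends on the cutoff, which is why a choice without biological basis is uncomfortable.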

-Kunbin



-----Original Message-----
From: Nicolas Delhomme [mailto:delhomme at embl.de]
Sent: Thu 4/15/2010 10:38 AM
To: Kasper Daniel Hansen
Cc: Kunbin Qu; bioc-sig-sequencing at r-project.org
Subject: Re: [Bioc-sig-seq] counts differences among multiple RNA-seq samples
 
Hi Kasper,

You are correct, this sounds like a perfect Genominator use case.
However, while working on my package, I realized that you can achieve
the same with out-of-the-box IRanges and Rsamtools/ShortRead
functions, without having to convert the data back and forth. This was
important for me as I use many IRanges functionalities in my
downstream analyses.

Cheers,

Nico

---------------------------------------------------------------
Nicolas Delhomme

High Throughput Functional Genomics Center

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------




On 15 Apr 2010, at 17:43, Kasper Daniel Hansen wrote:

> If you are mainly interested in counting, you should check out
> Genominator, which has been capable of doing this for a large number of
> samples for a long time.  It should be fairly easy to use, with the
> biggest hurdle usually being reading in the data at first.
>
> Kasper
>
> On Thu, Apr 15, 2010 at 11:23 AM, Nicolas Delhomme  
> <delhomme at embl.de> wrote:
>> Hi Kunbin,
>>
>> I'm currently developing an R package that does something close to what
>> you describe. Maybe we can discuss what you need in more detail, off
>> list, to see if I can help you out? If it turns out to be the case, then
>> we'll post the result back to the list.
>>
>> Cheers,
>>
>>
>> On 3 Apr 2010, at 05:48, Kunbin Qu wrote:
>>
>>> Hi,
>>>
>>> I have run RNA-seq on 4 human samples, and I'd like to look at the count
>>> from each sample at regions where any of the samples has some read
>>> coverage (say, a threshold of 5 reads). What is the best way to do this?
>>> The goal is basically to examine differentially expressed regions across
>>> the transcriptome, not just known annotated regions. I have been trying
>>> to use IRanges and related packages, but things start to get hairy when
>>> it comes to clustering the reads, condensing them (within a certain bp
>>> range), and back-tracking their identities. I also looked at Cufflinks,
>>> but it does not seem to be meant for this purpose, does it? Any advice
>>> is highly appreciated.
>>>
>>> -Kunbin
>>>
>>>
>>>
>>>
>>> ______________________________________________________________________
>>> The contents of this electronic message, including any attachments, are
>>> intended only for the use of the individual or entity to which they are
>>> addressed and may contain confidential information. If you are not the
>>> intended recipient, you are hereby notified that any use, dissemination,
>>> distribution, or copying of this message or any attachment is strictly
>>> prohibited. If you have received this transmission in error, please send
>>> an e-mail to postmaster at genomichealth.com and delete this message,
>>> along with any attachments, from your computer.
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>





