[Bioc-sig-seq] What to do when single reads make up a large percentage of counts?

Ludo Pagie lpagie at xs4all.nl
Mon Aug 24 11:24:45 CEST 2009


Hi Jenny

Maybe this doesn't answer your question directly but it seems very
relevant:

Limitations and possibilities of small RNA digital gene
expression profiling.

Linsen SE, de Wit E, Janssens G, Heater S, Chapman L, Parkin RK,
Fritz B, Wyman SK, de Bruijn E, Voest EE, Kuersten S, Tewari M,
Cuppen E.

Nat Methods. 2009 Jul;6(7):474-6. 

best, Ludo

On Tue, Aug 11, 2009 at 02:42:11PM -0500, Jenny Drnevich wrote:
> Hi everyone,
>
> I thought I would try this list before the general Bioconductor one  
> because my question pertains to NGS counts, although in reality it's a 
> general statistical theory question. I hope someone can help me or point 
> me in the right direction! Typically, you cannot compare counts from 
> different samples directly, but instead you have to adjust by the total 
> number of counts obtained for each sample, correct?  This assumes that 
> any changes in the counts of particular sequences will not substantially 
> affect the total count number... but what if it might? I'm helping a 
> colleague with some data where they sequenced the 18-30 nt fraction of 
> RNA to look for miRNAs; they got 1.1 to 2.1 million reads per sample, but 
> these aligned to only 187 miRNAs! Some of the miRNAs have up to 30% of 
> all reads, which is a really large percentage. Say a miRNA "X" that is 
> 30% of the reads doubles its count number in another sample, but the 
> counts for all other miRNAs are the same. The new percentage of "X" in 
> the second sample is not 60%, but instead 46.15%, and the observed ratios 
> of all the other miRNAs are decreased by a factor of 0.77 (= 1/1.3). Is 
> there any way to correct for this? What do you do when the top 5 miRNAs 
> make up 70% of the counts??
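
The arithmetic in the example above can be checked with a small sketch. The
counts, the miRNA names (m1..m4) and the 1.5x sequencing-depth difference are
made up for illustration and are not from Jenny's data. The second half uses
a median-of-ratios scaling factor, which is not something proposed in this
thread; it is only one possible way to scale samples, and it assumes that
most miRNAs do not change between the two samples.

```python
from statistics import median

# Toy counts (hypothetical, not from the thread's data): "X" is 30% of sample A.
sample_a = {"X": 300, "m1": 200, "m2": 150, "m3": 150, "m4": 200}
# In sample B, X doubles and the rest are unchanged; B is also sequenced
# 1.5x deeper, so every raw count carries that extra factor.
sample_b = {"X": 900, "m1": 300, "m2": 225, "m3": 225, "m4": 300}


def shares(counts):
    """Fraction of the library taken by each miRNA (total-count scaling)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def median_ratio_factor(ref, other):
    """Median of per-miRNA count ratios other/ref, skipping zero counts."""
    ratios = [other[k] / ref[k] for k in ref if ref[k] > 0 and other.get(k, 0) > 0]
    return median(ratios)


share_a, share_b = shares(sample_a), shares(sample_b)
# Fold changes after scaling by total counts: X looks like 1.54x, the rest 0.77x.
fold_total = {k: share_b[k] / share_a[k] for k in sample_a}

factor = median_ratio_factor(sample_a, sample_b)
# Fold changes after scaling by the median ratio: X is 2x, the rest 1x.
fold_median = {k: sample_b[k] / factor / sample_a[k] for k in sample_a}

print(f"share of X in B: {share_b['X']:.4f}")
print(f"total-count fold changes: X={fold_total['X']:.2f}, m1={fold_total['m1']:.2f}")
print(f"median-ratio fold changes: X={fold_median['X']:.2f}, m1={fold_median['m1']:.2f}")
```

With these toy numbers the share of X in sample B is 46.15% and the other
miRNAs appear to drop by 1/1.3 under total-count scaling, matching the
figures in the question. The median-based factor only recovers the intended
fold changes because four of the five toy miRNAs are unchanged; if most
miRNAs shift in the same direction it would be biased as well.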
>
> Thanks,
> Jenny
>
> Jenny Drnevich, Ph.D.
>
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
>
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
>
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu
>


