[Bioc-sig-seq] read length distribution from ShortReadQ or AlignedRead

Martin Morgan mtmorgan at fhcrc.org
Sun Aug 22 21:55:08 CEST 2010


On 08/22/2010 11:32 AM, joseph wrote:
> Perfect. Thanks.

a little tangential, but plot(table(width(reads))) avoids the arbitrary
histogram bins when the number of unique values of width(reads) is
small, and library(lattice); densityplot(width(reads),
plot.points=FALSE) avoids the arbitrary choice of bin widths when the
number of unique values of width(reads) is large.

Martin

> 
> 
> 
> 
> ________________________________
> From: Steve Lianoglou <mailinglist.honeypot at gmail.com>
> 
> Cc: bioc-sig-sequencing at r-project.org
> Sent: Sun, August 22, 2010 10:59:37 AM
> Subject: Re: [Bioc-sig-seq] read length distribution from ShortReadQ or 
> AlignedRead
> 
> Sorry .. didn't CC to bioc (like I told you to do :-) ...
> 
> On Sun, Aug 22, 2010 at 1:58 PM, Steve Lianoglou
> <mailinglist.honeypot at gmail.com> wrote:
>> Hi Joseph,
>>
>> Don't forget to hit "Reply All" when responding to BioC emails, so
>> that the help stays on the list ..
>>
> 
>>> Thanks Steve.
>>> how do I get a table of the actual values?
>>
>> The `width` function returns a vector of the same length as your
>> ShortRead object (which is the number of reads you have there), so
>> assuming your ShortRead object is still called `reads`:
>>
>> R> read.length <- width(reads)
>>
>> That `read.length` vector has the length of all the reads. You can
>> then manipulate it as you would any other vector in R. It sounds like
>> you might want the `table` or `tabulate` function, or something
>> similar.
>>
>> Also note that the `hist` function returns an "invisible" object you
>> can work with. If you do something like this:
>>
>> R> h <- hist(width(reads))
>>
>> `h` might look something like this:
>>
>> R> h
>> $breaks
>> [1] 30 40
>>
>> $counts
>> [1] 256
>>
>> $intensities
>> [1] 0.1
>>
>> $density
>> [1] 0.1
>>
>> $mids
>> [1] 35
>>
>> $xname
>> [1] "width(rfq)"
>>
>> $equidist
>> [1] TRUE
>>
>> attr(,"class")
>> [1] "histogram"
>>
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>>  | Memorial Sloan-Kettering Cancer Center
>>  | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>
> 
> 
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



More information about the Bioc-sig-sequencing mailing list