[Bioc-sig-seq] about N statistics

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Sep 6 22:42:14 CEST 2011


Hi,

Are you looking for the number of reads that have 0, 1, ..., X 'N's in them?

If so, you can stop here:

On Tue, Sep 6, 2011 at 4:22 PM, wang peter <wng.peter at gmail.com> wrote:
> I used a crude way to get the distribution of reads by the number of Ns
> they contain:
>
> library(ShortRead)
> reads <- readFastq(fastqfile)
> ids <- id(reads)
> seqs <- sread(reads)
> # Is there a Bioconductor function that gives this information directly?
> nCount <- alphabetFrequency(seqs)[, "N"]

And do:

R> n.distro <- table(nCount)

or some such, I think.
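To make that concrete, here is a minimal sketch in base R. The toy `seqs` vector is made up and only stands in for `sread(reads)`; on real data you would keep the `alphabetFrequency()` call quoted above. One thing to be aware of is that `table()` only lists the N counts that actually occur, so if you want the empty bins too, you can fix the levels yourself (`n.distro.full` is just an illustrative name):

```r
# Toy reads standing in for sread(reads); these sequences are made up.
seqs <- c("ACGT", "ACNT", "NNGT", "ACGT", "NNNN")

# Count the 'N' characters in each read using base R only.
# (With ShortRead/Biostrings loaded: alphabetFrequency(seqs)[, "N"].)
nCount <- lengths(regmatches(seqs, gregexpr("N", seqs, fixed = TRUE)))

# Plain table(): only N counts that actually occur get an entry.
n.distro <- table(nCount)

# Fixing the factor levels keeps empty bins (here, 3 Ns) as explicit zeros.
n.distro.full <- table(factor(nCount, levels = 0:max(nCount)))
print(n.distro.full)
```

The names of the resulting table are the N counts and the values are the numbers of reads, so `n.distro.full["0"]` is the number of reads without any N.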

But it seems like you should also have the same answer in nCountHist,
as you've done it below, no?

> nCountHist<-hist(nCount,breaks=max(nCount))
> nCountHist["breaks"]
> nCountHist["counts"]
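One caveat with the hist() route, as far as I can tell: with the default `right = TRUE` and `include.lowest = TRUE`, integer breaks 0, 1, 2, ... put both 0 and 1 into the first bin. The quoted output has 56 breaks but only 55 counts, so its first count presumably pools reads with 0 and 1 Ns. A small sketch on made-up counts (the variable names are illustrative), using half-integer breaks so that each integer gets its own bin:

```r
# Made-up N counts per read, standing in for the real nCount vector.
nCount <- c(0, 0, 0, 1, 2, 2, 5)

# Integer breaks: bins are [0,1], (1,2], (2,3], ... so 0 and 1 share a bin.
pooled <- hist(nCount, breaks = 0:max(nCount), plot = FALSE)$counts

# Half-integer breaks: every integer value falls into a bin of its own.
per.value <- hist(nCount,
                  breaks = seq(-0.5, max(nCount) + 0.5, by = 1),
                  plot = FALSE)$counts

# The same per-value counts via table(), keeping empty bins as zeros.
tab <- table(factor(nCount, levels = 0:max(nCount)))

print(pooled)     # 5 bins; the first one holds the reads with 0 or 1 N
print(per.value)  # 6 bins, one per N count from 0 to 5
```

If you only want the counts, `table()` is the more direct of the two.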

If that's not what you need, then maybe you can be a bit more specific
about what you are after?

-steve

> $breaks
>  [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
> [26] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
> [51] 50 51 52 53 54 55
>> nCountHist["counts"]
> $counts
>  [1] 16988332     3975     4365     3099     2760     2473     2918     3045
>  [9]     3320     3028     3290     3560     4695     4546     3939     4255
> [17]     3899     4025     6764     3554     4056     2716     1812     1456
> [25]     1618     2133     2253     1809     1638      924      951      889
> [33]      931     1089     1868     3344      348       36       20       25
> [41]       12       16       10       24        9        4        4        3
> [49]        0        0        3        1        1        0        1
>
> What I need is just the number of reads for each N count, as in the
> output above.
>
> thx
>



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


