[Bioc-sig-seq] sum of maskedwidth vs. maskedwidth in 'all masks together'

Sean Davis sdavis2 at mail.nih.gov
Wed Jun 4 18:25:23 CEST 2008


On Wed, Jun 4, 2008 at 11:51 AM, Joseph Dhahbi, Ph.D.
<jdhahbi at chori.org> wrote:
>
> Hi
>
> If the masked width is the total number of nucleotide positions that are
> masked, then why is the maskedwidth in 'all masks together' not equal to
> the sum of the maskedwidths of all 3 masks?

Hi, Joseph.

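The maskedwidth in 'all masks together' is the number of nucleotide
positions covered by at least one of the masks, i.e. their union.  If a
base is included in more than one mask, it is counted only once, so the
combined value can be smaller than the sum of the individual
maskedwidths.  In your output the difference, 2028364 - 1988181 = 40183,
is the double counting caused by overlap among the three masks.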
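
As a rough illustration of the counting, here is a small base-R sketch.
The intervals and the names (masks, mask_width, covered) are made up for
the example; they are not the real chr2L masks and not how the package
computes the value internally.

```r
# Toy masks as inclusive start/end intervals (hypothetical values,
# not the actual chr2L masks).
masks <- list(
  gaps    = data.frame(start = c(1),      end = c(10)),
  repeats = data.frame(start = c(5, 40),  end = c(20, 49)),
  tandem  = data.frame(start = c(15, 45), end = c(24, 60))
)

# Width of a single mask: total positions covered by its intervals.
mask_width <- function(m) sum(m$end - m$start + 1)

# Positions covered by at least one mask, each position listed once.
covered <- function(ms) {
  pos <- unlist(lapply(ms, function(m) unlist(Map(seq, m$start, m$end))))
  unique(pos)
}

sum_of_widths <- sum(sapply(masks, mask_width))  # overlaps counted repeatedly
union_width   <- length(covered(masks))          # overlaps counted once

cat(sum_of_widths, union_width, "\n")  # prints: 62 45
```

The gap of 17 between the two numbers is exactly the number of positions
shared between overlapping toy masks.  Enumerating positions like this is
only sensible for tiny examples; on real ranges, something like
sum(width(reduce(x))) on an IRanges object should give the union width
without expanding every position.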

Sean

>> chr2L=Dmelanogaster$chr2L
>> chr2L
>
>  23011544-letter "MaskedDNAString" instance (# for masking)
> seq:
> CGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCATTTTCTCTCCCATATTATAGGGAGAAATAT...CAATCAAACTGTGTTCGAAAAAGAGAAAACTAACATTTTTTTGGCATATTTGCAAATTTTGATGAACCCCCCTTTCAAA
> masks:
>   maskedwidth  maskedratio active                              names
> 1         200 8.691290e-06  FALSE                      assembly gaps
> 2     1966561 8.545976e-02  FALSE                       RepeatMasker
> 3       61603 2.677048e-03  FALSE Tandem Repeats Finder [period<=12]
> all masks together:
>  maskedwidth maskedratio
>      1988181  0.08639929
> all active masks together:
>  maskedwidth maskedratio
>            0           0
>
>
>> sum(200, 1966561, 61603)
>
> [1] 2028364


