[Bioc-sig-seq] coverage vectors and subseq

Martin Morgan mtmorgan at fhcrc.org
Tue Aug 17 18:47:25 CEST 2010


kirti prakash <kirtiprakash3.14 at gmail.com> writes:

> Hi,
>
> I was reading the material "EMBL course on Short Read analysis with
> Bioconductor: An exercise with coverage vectors" by Simon Anders.

It took me a second to find

  http://bioconductor.org/help/course-materials/2009/EMBLJune09/Practicals/TSS/

Is that the tutorial you mean?

>
> I tried this...
> aln <- readAligned("dirPath", pattern="sequence.map", type="Bowtie")
> cov = coverage(aln)
> cov
> [[1]]
> SimpleRleList of length 25
> $chr1
> 'integer' Rle of length 247188620 with 840106 runs
>  Lengths:    463     36   6823     36 550058 ...    713     36   2034     36
>  Values :      0      1      0      1      0 ...      0      1      0      2
>
> $chr10
> 'integer' Rle of length 135373320 with 446681 runs
>  Lengths: 88078    36  3880    36 12451 ...    20    50    36 22054    36
>  Values :     0     1     0     1     0 ...     1     0     1     0     1
> .
> .
> .
>
>>cvg <- cov$chr10
>> as.vector(subseq(cvg, 123456+50, 123456-50))

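I think this should be

  seqselect(cvg$chr10, 123456-50, 123456+50)

where cvg is the whole RleList returned by coverage(), not a single
chromosome. Note that the start has to come before the end, so the
smaller coordinate (123456-50) goes first. Or, to subset all
chromosomes,

  endoapply(cvg, seqselect, 123456-50, 123456+50)

(though it doesn't seem like you'd usually want to select the same
coordinates on every chromosome).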
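
To make the subsetting easy to check by eye, here is a minimal sketch on a toy coverage list. The chromosome lengths, run values, and the names pos and flank are made up for illustration. It uses window() rather than seqselect(), on the assumption that window() is available for Rle objects in your IRanges version. It is not meant as the only way to do this.

```r
library(IRanges)

# Toy coverage list; the chromosome names and run values are made up.
cvg <- RleList(
    chr1  = Rle(c(0L, 1L, 0L), c(5L, 3L, 4L)),
    chr10 = Rle(c(0L, 1L, 2L, 0L), c(10L, 5L, 5L, 10L))
)

# Pick one chromosome first, then a window around a position of interest.
# The start (pos - flank) must not be larger than the end (pos + flank).
pos   <- 13L
flank <- 5L
w <- window(cvg[["chr10"]], start = pos - flank, end = pos + flank)

# Expand the run-length encoding into an ordinary integer vector.
as.vector(w)
```

With these toy values the window covers positions 8 to 18, so the result should be three 0s, five 1s and three 2s, one value per base.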

To get here, I looked up the help page for Rle-class

 > class?Rle

This is with

> sessionInfo()
R version 2.11.1 Patched (2010-06-03 r52215)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] ShortRead_1.6.2     Rsamtools_1.0.5     lattice_0.18-8
 [4] Biostrings_2.16.9   GenomicRanges_1.0.5 IRanges_1.6.11

 loaded via a namespace (and not attached):
 [1] Biobase_2.8.0 grid_2.11.1   hwriter_1.2   tools_2.11.1

Martin


> Error in function (classes, fdef, mtable)  :
>  unable to find an inherited method for function "subseq", for signature "list"
> Error in as.vector(subseq(cvg, 123456 + 50, 123456 - 50)) :
>  error in evaluating the argument 'x' in selecting a method for
> function 'as.vector'
>
> I guess * 'integer' Rle of length* should be 'numeric' Rle of length
> as per the booklet ... but I don't know how to fix it.
>
> I know I am making some stupid mistake. It would be great if anyone
> can provide some help on this.
>
> Thank you,
>
> Best regards,
>
> Kirti Prakash
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


