[Bioc-devel] cigarToRleList fails

Hervé Pagès hpages at fhcrc.org
Fri Feb 21 20:17:34 CET 2014


Hi Gabriel,

On 02/20/2014 05:03 PM, Gabriel Becker wrote:
> Herve,
>
> The help is correct (though possibly a bit pedantic),  there is no
> method for that signature.

But the dispatch mechanism is able to find one because
'import(path_to_gff)' *does* work. So the help maybe is correct
but that doesn't really help the naive user.

H.

>
> ?import("", "")
>
> works for me though
>
> ~G
>
>
> On Thu, Feb 20, 2014 at 4:51 PM, Hervé Pagès <hpages at fhcrc.org
> <mailto:hpages at fhcrc.org>> wrote:
>
>     On 02/20/2014 04:16 PM, Michael Lawrence wrote:
>
>         There's also "?coverage(ga)", so the user can see what happens
>         specifically for their object, without worrying about typing out
>         the class.
>
>
>     I was never lucky with this:
>
>        > library(rtracklayer)
>        > path_to_gff <- system.file("tests", "v1.gff", package =
>     "rtracklayer")
>        > ?import(path_to_gff)
>        Error in .helpForCall(topicExpr, parent.frame()) :
>          no documentation for function ‘import’ and signature ‘con =
>     "character", format = "ANY", text = "ANY"’
>        In addition: Warning message:
>        In .helpForCall(topicExpr, parent.frame()) :
>          no method defined for function ‘import’ and signature ‘con =
>     "character", format = "ANY", text = "ANY"’
>
>     H.
>
>
>         At some point it would be neat to generate S4 documentation at
>         run-time.
>         Just popup a browser and see every method that might be
>         dispatched with
>         a given object. In theory, the R help server would support this.
>
>
>         On Thu, Feb 20, 2014 at 3:13 PM, Hervé Pagès <hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>> wrote:
>
>
>
>              On 02/20/2014 02:55 PM, Martin Morgan wrote:
>
>                  On 02/20/2014 02:32 PM, Hervé Pagès wrote:
>
>                      Hi Jesper,
>
>                      On 02/20/2014 02:13 PM, Jesper Gådin wrote:
>
>                          Very true that it is quite difficult to find the
>                          documentation when one
>                          is not aware of its existence :P
>
>
>                      Yeah, this has been a source of frustration for
>         many people. And
>                      always a source of embarrassment (for us) when
>         teaching our
>                      software
>                      to beginners.
>
>                      I've started to change this. In the upcoming
>         version of BioC
>                      (2.14,
>                      scheduled for mid-April), when you'll do ?coverage,
>         you'll
>                      get to
>                      choose between the 3 man pages that document coverage
>                      methods (there
>                      is one in IRanges, one in GenomicRanges, and one in
>                      GenomicAlignments).
>
>                      I want to generalize this to other generics that have
>                      methods spread
>                      across several packages (e.g. findOverlaps, the
>         intra- and
>                      inter-range
>                      methods, etc...).
>
>                      Having to choose between several man pages every
>         time you do
>                      e.g.
>                      ?findOverlaps is a minor annoyance compared to not
>         being able to
>                      find the man page at all. (Note that if you already
>         know
>                      where is
>                      your favorite man page, you'll be able to direct
>         access it with
>                      e.g. ?GenomicRanges::findOverlaps). Nobody will
>         ever need to use
>                      the insane
>
>         ?`findOverlaps,GenomicRanges,____GenomicRanges-method` to
>
>
>
>                  tab completion helps, so that you don't need to be totally
>                  insane, just
>                  insane enough to know to start with
>
>                  ?"cover<tab>
>
>
>              and also insane enough to know which method you need to
>         pick up amongst
>              the 30+ methods listed by ?"findOverlaps<tab>
>              Overwhelming!
>
>              H.
>
>
>
>                  Martin
>
>                      open that man page again. Ever! (it will still work
>         though...)
>
>                      Cheers,
>                      H.
>
>
>                          coverage() is fast and beautiful. Thanks!
>
>                          /Jesper
>
>
>                          On Wed, Feb 19, 2014 at 9:21 PM, Hervé Pagès
>                          <hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>
>                          <mailto:hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org> <mailto:hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org>>>> wrote:
>
>                               Hi Jesper,
>
>
>                               On 02/19/2014 08:44 AM, Michael Lawrence
>         wrote:
>
>                                   On Wed, Feb 19, 2014 at 8:39 AM,
>         Jesper Gådin
>                                   <jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>
>                          <mailto:jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>__>
>                          <mailto:jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>
>                          <mailto:jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>__>__>>wrote:
>
>
>                                       Hi Michael,
>
>                                       Herves suggestion will probably
>         work for my
>                          use case, but if
>                                       there are any
>                                       smoother ways it would be preferable.
>
>                                       The use case is as follows:
>
>                                       1) calculate strand specific
>         coverage over
>                          a region from
>                                       GAlignments object (or file)
>
>                                       At the moment I read a file using
>                          readGAlignmentsFromBam()
>                                       with tag="XS",
>                                       then filter it on "flag" and
>         "mapq". Then I
>                          subset the
>                                       resulting GAlignments in a minus
>         and a plus
>                          -strand object.
>                                       Then I calculate coverage by my own
>                          coverage function which
>                                       uses the cigar
>                                       information in the GAlignments
>         object. This
>                          function is the
>                                       one using
>                                       cigarToRleList() at the moment. As I
>                          understand the
>                                       coverage() function
>                                       from the GenomicAlignments/IRanges
>         package
>                          does not take
>                                       into account
>                                       cigars, or does it?
>
>
>                                   It does take the coverage into account;
>                          specifically to exclude
>                                   the introns
>                                   from coverage. I think there's also an
>         option
>                          to exclude
>                          deletions.
>
>
>                               Unfortunately the man page is not easy to
>         access
>                          (you need to do
>                               ?`coverage,GAlignments-method`______), but
>         it says:
>
>                                   The methods for GAlignments and
>         GAlignmentPairs
>                          objects do:
>
>                                     coverage(grglist(x), ...)
>
>                               And if you do grglist() on a GAlignments or
>                          GAlignmentPairs
>                          objects, the
>                               ranges you get in the returned GRangesList
>         object
>                          are calculated
>                          based
>                               on the CIGAR.
>
>                               Trust but verify. Here is how you can actually
>                          verify that
>                          coverage()
>                               does take the CIGAR into account:
>
>                                  library(RNAseqData.HNRNPC.bam.______chr14)
>                                  gal <-
>
>         readGAlignments(RNAseqData.______HNRNPC.bam.chr14_BAMFILES[1])
>
>                                  cig_op_table <- cigarOpTable(cigar(gal))
>
>                               First we pick up an alignment with an N in its
>                          CIGAR and do
>                          coverage()
>                               on it:
>
>                                  > gal_with_N <- gal[which(cig_op_table[
>         , "N"]
>                          != 0)[1]]
>
>                                  > gal_with_N
>                                  GAlignments with 1 alignment and 0
>         metadata columns:
>                                        seqnames strand       cigar    qwidth
>                          start       end
>                                  width
>                                           <Rle>  <Rle> <character> <integer>
>                          <integer> <integer>
>                               <integer>
>                                    [1]    chr14      + 55M2117N17M        72
>                            19650072  19652260
>                                   2189
>                                             ngap
>                                        <integer>
>                                    [1]         1
>                                    ---
>                                    seqlengths:
>                                                    chr1
>         chr10 ...
>                                 chrY
>                                               249250621
>         135534747 ...
>                                 59373566
>
>                                  > coverage(gal_with_N)$chr14
>                                  integer-Rle of length 107349540 with 5 runs
>                                    Lengths: 19650071       55     2117
>              17
>                          87697280
>                                    Values :        0        1        0
>               1
>                                0
>
>                               Same thing with an alignment with an I in
>         its CIGAR:
>
>                                  > gal_with_I <- gal[which(cig_op_table[
>         , "I"]
>                          != 0)[1]]
>
>                                  > gal_with_I
>                                  GAlignments with 1 alignment and 0
>         metadata columns:
>                                        seqnames strand       cigar    qwidth
>                          start       end
>                                  width
>                                           <Rle>  <Rle> <character> <integer>
>                          <integer> <integer>
>                               <integer>
>                                    [1]    chr14      -     64M1I7M        72
>                            19411677  19411747
>                                     71
>                                             ngap
>                                        <integer>
>                                    [1]         0
>                                    ---
>                                    seqlengths:
>                                                    chr1
>         chr10 ...
>                                 chrY
>                                               249250621
>         135534747 ...
>                                 59373566
>
>                                  > coverage(gal_with_I)$chr14
>                                  integer-Rle of length 107349540 with 3 runs
>                                    Lengths: 19411676 71 87937793
>         <tel:71%2087937793>
>                                    Values :        0        1        0
>
>                               Same thing with an alignment with a D in
>         its CIGAR:
>
>                                  > gal_with_D <- gal[which(cig_op_table[
>         , "D"]
>                          != 0)[1]]
>
>                                  > gal_with_D
>                                  GAlignments with 1 alignment and 0
>         metadata columns:
>                                        seqnames strand       cigar    qwidth
>                          start       end
>                                  width
>                                           <Rle>  <Rle> <character> <integer>
>                          <integer> <integer>
>                               <integer>
>                                    [1]    chr14      +    38M1D34M        72
>                            19659063  19659135
>                                     73
>                                             ngap
>                                        <integer>
>                                    [1]         0
>                                    ---
>                                    seqlengths:
>                                                    chr1
>         chr10 ...
>                                 chrY
>                                               249250621
>         135534747 ...
>                                 59373566
>
>                                  > coverage(gal_with_D)$chr14
>                                  integer-Rle of length 107349540 with 3 runs
>                                    Lengths: 19659062       73 87690405
>                                    Values :        0        1        0
>
>                               Seeing is believing,
>
>                               Cheers,
>                               H.
>
>
>
>                                       I started to look at the
>         applyPileups() in
>                          Rsamtools which I
>                                       can get to
>                                       calculate coverage using cigars,
>         but not
>                          using the strand or
>                                       flag
>                                       information for filtering. That
>         solution
>                          would start from a
>                                       bam-file
>                                       instead of a GAlignments object,
>         and sure I
>                          can do the
>                                       filtering outside R.
>                                       But it would be very convenient to
>         do it
>                          all from within R.
>
>                                       If there are nice solutions
>         starting from
>                          both a GAlignments
>                                       and a
>                                       bam-file it would be great! =)
>
>                                       /Jesper
>
>
>
>                                       On Tue, Feb 18, 2014 at 10:52 PM,
>         Michael
>                          Lawrence <
>         lawrence.michael at gene.com <mailto:lawrence.michael at gene.com>
>         <mailto:lawrence.michael at gene.__com
>         <mailto:lawrence.michael at gene.com>>
>                                       <mailto:lawrence.michael at gene.
>         <mailto:lawrence.michael at gene.>____com
>
>                          <mailto:lawrence.michael at gene.__com
>         <mailto:lawrence.michael at gene.com>>>> wrote:
>
>                                           Hi Jesper,
>
>                                           Would you be willing to
>         volunteer your
>                          use case? As
>                                           Herve hinted,
>                                           cigarToRleList and friends are
>                          low-level helpers. There
>                                           may be an easier
>                                           way to achieve what you want,
>         or an
>                          opportunity to
>                                           improve things.
>
>                                           Michael
>
>
>                                           On Mon, Feb 17, 2014 at 1:10
>         AM, Jesper
>                          Gådin
>                                           <jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>
>                          <mailto:jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>__>
>                                           <mailto:jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>
>                          <mailto:jesper.gadin at gmail.com
>         <mailto:jesper.gadin at gmail.com>__>__>>wrote:
>
>
>                                               Hi,
>
>                                               Have come across a
>         cigar-vector
>                          that is problematic
>                                               to process.
>
>                                               #load package
>
>                                                   library(GenomicAlignments)
>
>
>                                               #load data (see attached file)
>
>
>                          load("2014-02-17-cigarExample.______rdata")
>
>
>
>                                               #run function cigarToRleList
>
>                                                   cigarRle <-
>                          cigarToRleList(cigarExample)
>
>                                               Error in
>         .Call2("Rle_constructor",
>                          values, lengths,
>                                               check, 0L, PACKAGE =
>                                               "IRanges") :
>                                                   integer overflow while
>         summing
>                          elements in
>                          'lengths'
>
>                                                   sessionInfo()
>
>                                               R Under development (unstable)
>                          (2013-11-13 r64209)
>                                               Platform:
>         x86_64-unknown-linux-gnu
>                          (64-bit)
>
>                                               locale:
>                                                  [1] LC_CTYPE=en_US.UTF-8
>                          LC_NUMERIC=C
>                                                  [3] LC_TIME=en_US.UTF-8
>                          LC_COLLATE=en_US.UTF-8
>                                                  [5] LC_MONETARY=en_US.UTF-8
>                                                 LC_MESSAGES=en_US.UTF-8
>                                                  [7] LC_PAPER=en_US.UTF-8
>                          LC_NAME=C
>                                                  [9] LC_ADDRESS=C
>                          LC_TELEPHONE=C
>                                               [11]
>         LC_MEASUREMENT=en_US.UTF-8
>                          LC_IDENTIFICATION=C
>
>                                               attached base packages:
>                                               [1] parallel  stats
>         graphics
>                            grDevices utils
>                                                  datasets  methods
>                                               [8] base
>
>                                               other attached packages:
>                                               [1] GenomicAlignments_0.99.18
>                          Rsamtools_1.15.26
>                                               [3] Biostrings_2.31.12
>                            XVector_0.3.6
>                                               [5] GenomicRanges_1.15.26
>                          IRanges_1.21.23
>                                               [7] BiocGenerics_0.9.3
>
>                                               loaded via a namespace
>         (and not
>                          attached):
>                                                  [1] BatchJobs_1.1-1135
>         BBmisc_1.5
>                                               BiocParallel_0.5.8
>                                               bitops_1.0-6
>
>                                                  [5] brew_1.0-6
>                          codetools_0.2-8
>                          DBI_0.2-7
>                                                  digest_0.6.4
>
>                                                  [9] fail_1.2
>         foreach_1.4.1
>                                                 iterators_1.0.6    plyr_1.8
>
>                                               [13] RSQLite_0.11.4
>         sendmailR_1.1-2
>                                                 stats4_3.1.0
>         tools_3.1.0
>
>                                               [17] zlibbioc_1.9.0
>
>
>
>         _____________________________________________________
>         Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>         <mailto:Bioc-devel at r-project.__org
>         <mailto:Bioc-devel at r-project.org>>
>
>           <mailto:Bioc-devel at r-project.
>         <mailto:Bioc-devel at r-project.>____org
>                          <mailto:Bioc-devel at r-project.__org
>         <mailto:Bioc-devel at r-project.org>>> mailing list
>         https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>
>         <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>
>
>
>         <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>
>         <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>
>
>
>
>
>
>                                            [[alternative HTML version
>         deleted]]
>
>
>
>
>
>           _____________________________________________________
>         Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>                          <mailto:Bioc-devel at r-project.__org
>         <mailto:Bioc-devel at r-project.org>>
>                          <mailto:Bioc-devel at r-project.
>         <mailto:Bioc-devel at r-project.>____org
>                          <mailto:Bioc-devel at r-project.__org
>         <mailto:Bioc-devel at r-project.org>>>
>                                   mailing list
>         https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>
>         <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>
>
>
>         <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>
>         <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>         <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>
>
>
>                               --
>                               Hervé Pagès
>
>                               Program in Computational Biology
>                               Division of Public Health Sciences
>                               Fred Hutchinson Cancer Research Center
>                               1100 Fairview Ave. N, M1-B514
>                               P.O. Box 19024
>                               Seattle, WA 98109-1024
>
>                               E-mail: hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org> <mailto:hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org>>
>                          <mailto:hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org> <mailto:hpages at fhcrc.org
>         <mailto:hpages at fhcrc.org>>>
>
>                               Phone: (206) 667-5791
>         <tel:%28206%29%20667-5791> <tel:%28206%29%20667-5791>
>                          <tel:%28206%29%20667-5791>
>                               Fax: (206) 667-1319
>         <tel:%28206%29%20667-1319> <tel:%28206%29%20667-1319>
>                          <tel:%28206%29%20667-1319>
>
>
>
>
>
>
>              --
>              Hervé Pagès
>
>              Program in Computational Biology
>              Division of Public Health Sciences
>              Fred Hutchinson Cancer Research Center
>              1100 Fairview Ave. N, M1-B514
>              P.O. Box 19024
>              Seattle, WA 98109-1024
>
>              E-mail: hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>         <mailto:hpages at fhcrc.org <mailto:hpages at fhcrc.org>>
>              Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
>         <tel:%28206%29%20667-5791>
>              Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>         <tel:%28206%29%20667-1319>
>
>
>
>     --
>     Hervé Pagès
>
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M1-B514
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>
>     E-mail: hpages at fhcrc.org <mailto:hpages at fhcrc.org>
>     Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
>     Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>
>     _________________________________________________
>     Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org> mailing list
>     https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>     <https://stat.ethz.ch/mailman/listinfo/bioc-devel>
>
>
>
>
> --
> Gabriel Becker
> Graduate Student
> Statistics Department
> University of California, Davis

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioc-devel mailing list