[Bioc-devel] is.unsorted method for GRanges objects

Michael Lawrence lawrence.michael at gene.com
Tue Nov 3 02:35:31 CET 2015


The notion of sortedness is already formally defined, which is why we have
an order method, etc.

The base is.unsorted implementation for "objects" ends up calling
base::.gt() for each adjacent pair of elements, which is likely too slow to
be practical, so we probably should add a custom method.
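
As a rough sketch of what a vectorized check could look like (not a patch, and
not how GenomicRanges does or will do it): compare each element with its
neighbour on a sequence of keys in priority order. The helper name
is_unsorted_by_keys and the choice of integer keys (seqlevel code, strand
code, start, end) are assumptions for illustration only; the real method would
have to use whatever ordering the existing order method defines.

```r
# Hypothetical helper, base R only. Each argument in ... is an atomic vector
# of equal length; earlier arguments take priority over later ones.
# Returns TRUE if the implied rows are not in non-decreasing lexicographic
# order (or not strictly increasing when strictly = TRUE).
is_unsorted_by_keys <- function(..., strictly = FALSE) {
  keys <- list(...)
  n <- length(keys[[1L]])
  if (n < 2L) return(FALSE)
  prev <- seq_len(n - 1L)
  nxt <- prev + 1L
  # tied[i] stays TRUE while elements i and i + 1 agree on every key so far
  tied <- rep(TRUE, n - 1L)
  for (k in keys) {
    # a decrease only matters for pairs still tied on all higher-priority keys
    if (any(tied & k[prev] > k[nxt])) return(TRUE)
    tied <- tied & k[prev] == k[nxt]
  }
  # pairs tied on every key violate strict ordering only
  strictly && any(tied)
}

# Toy data standing in for a GRanges; the integer codes are assumptions.
seq_id    <- c(1L, 1L, 2L, 2L)
strand_id <- c(1L, 1L, 1L, 2L)
start     <- c(10L, 10L, 5L, 1L)
end       <- c(20L, 25L, 8L, 3L)
is_unsorted_by_keys(seq_id, strand_id, start, end)
```

This makes one vectorized pass per key over adjacent pairs, so it avoids both
the per-pair dispatch and the full sort implied by is.unsorted(order(x)).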

This does bring up the tangential question of whether GenomicRanges should
have an anyNA method that returns FALSE (and similarly an is.na() method),
although we have never defined the concept of a "missing range".

Michael

On Mon, Nov 2, 2015 at 4:55 PM, Gabe Becker <becker.gabe at gene.com> wrote:

> Pete,
>
> What does sorted mean for a GRanges? If the starts are sorted but the ends
> aren't, does that count? What if only the ends are sorted but the ranges are
> on the negative strand?
>
> Do we consider seqlevels to be ordinal in the order the levels are returned
> from seqlevels()? That usually makes sense, but does it always?
>
> In essence I'm asking if sortedness is a well enough defined term for an
> is.unsorted method to make sense.
>
> Best,
> ~G
> On Nov 2, 2015 4:27 PM, "Peter Hickey" <peter.hickey at gmail.com> wrote:
>
> > Hi all,
> >
> > I sometimes want to test whether a GRanges object (or some object with
> > a GRanges slot, e.g., a SummarizedExperiment object) is (un)sorted.
> > There is no is.unsorted,GRanges-method or, rather, it defers to
> > is.unsorted,ANY-method. I'm unsure that deferring to the
> > is.unsorted,ANY-method is what is really desired when a user calls
> > is.unsorted on a GRanges object, and it will certainly return a
> > (possibly unrelated) warning - "In is.na(x) : is.na() applied to
> > non-(list or vector) of type 'S4'".
> >
> >
> > For this reason, I tend to use is.unsorted(order(x)) when x is a
> > GRanges object. This workaround is also used, for example, by minfi
> > (https://github.com/kasperdanielhansen/minfi/blob/master/R/blocks.R#L121).
> > However, this is slow because it essentially sorts the object to test
> > whether it is already sorted.
> >
> >
> > So, to my questions:
> >
> > 1. Have I overlooked a fast way to test whether a GRanges object is
> > sorted?
> > 2a. Could an is.unsorted,GenomicRanges-method be added to the
> > GenomicRanges package? Side note, I'm unsure at which level to define
> > this method, e.g., GRanges vs. GenomicRanges.
> > 2b. Is it possible to have a sensible definition and implementation
> > for is.unsorted,GRangesList-method?
> > 2c. Could an is.unsorted,RangedSummarizedExperiment-method be added to
> > the SummarizedExperiment package?
> >
> > I started working on a patch for 2a/2c, but wanted to ensure I hadn't
> > overlooked something obvious. Also, I'm sure 2a/2b/2c could be written
> > much more efficiently at the C-level but I'm afraid this might be a
> > bit beyond my abilities to integrate nicely with the existing code.
> >
> > Thanks,
> > Pete
> >
> > _______________________________________________
> > Bioc-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >