[Bioc-devel] request: high-level seqlevel utilities

Tim Triche, Jr. tim.triche at gmail.com
Fri Dec 27 21:53:02 CET 2013


Hi Sonali,

Thank you for this much requested addition to the GenomicRanges API!

I would like to second Michael Lawrence's assertion that keepAutosomes() is more straightforward and harder to misunderstand or misuse than a special case of keep/dropSeqlevels.   (If the author of a function has trouble using it, think what that means for the rest of us!)  An ideal API is (IMHO) one that is easy to use well, and hard to use poorly. I propose that this exchange has illustrated why a very experienced programmer (ML) would nonetheless suggest API "bloat". 

The recent history of the Ranges API has (again IMHO) been a series of exchanges balancing convenience against elegance, often with significant disagreements on where the balance should lie. But if a "special case" turns out to be the common case, historically, a variance has been granted. Examples include sortSeqlevels(), as(data.frame, "GRanges"), and x$foo as shorthand for elementMetadata(x)[,'foo'].  

People seldom do what they know to be right. They do what is convenient, then repent. (Ask me how I know ;-))  In my experience, at least, the best way to get people to "do the right thing" is to make it convenient.

Thank you for being bold enough to stand up for the design choices you believe in. I hope my request to trade a little elegance for readability does not come across as criticism thereof. 

Best,

--t

> On Dec 27, 2013, at 12:11 PM, Michael Lawrence <lawrence.michael at gene.com> wrote:
> 
> I guess you mean dropSeqlevels, not keepSeqlevels? That should work. But
> your function needs some tweaking, I think. The function should just take
> the seqnameStyle from the object. There's no need to require the user to
> specify it, and then check for consistency. I would like to be able to just
> call:
> 
> keepStandardChromosomes(trans)
> 
> Also, I think you made GenomicRanges depend on AnnotationDbi. That's a
> major change. Before, it was just in Suggests. Did you discuss this with
> anyone?
> 
> Michael
> 
> 
> 
> 
>> On Fri, Dec 27, 2013 at 11:37 AM, Arora, Sonali <sarora at fhcrc.org> wrote:
>> 
>> Hi Michael,
>> 
>> We decided to come up with one function called keepStandardChromosomes()
>> which
>> is a wrapper for both the functions suggested by you (i.e.,
>> keepPrimaryChromosomes() and keepAutosomes() )
>> 
>> Here is an example:
>> 
>> library(AnnotationDbi)
>> library(GenomicRanges)
>> txdb <- loadDb(system.file("extdata", "UCSC_knownGene_sample.sqlite",
>>                           package="GenomicFeatures"))
>> 
>> trans <- transcripts(txdb)
>> seqlevels(trans)
>> 
>> #gets all GRanges for Human chromosomes : 1-22,X,Y,M
>> got <- keepStandardChromosomes(trans,style="UCSC", species="Homo sapiens")
>> seqlevels(got)
>> 
>> #for getting only autosomes, one could then extract using keepSeqlevels()
>> #as shown earlier:
>> keepSeqlevels(got,c("chrX","chrY"))
>> keepSeqlevels(got,c("chrX","chrY", "chrM"))
>> 
>> Thanks and Regards
>> Sonali.
>> 
>> On 12/25/2013 4:15 AM, Michael Lawrence wrote:
>> 
>> Awesome -- how about keepAutosomes?
>> 
>> 
>>> On Tue, Dec 24, 2013 at 1:11 PM, Arora, Sonali <sarora at fhcrc.org> wrote:
>>> 
>>> Hi everyone,
>>> 
>>> We are pleased to announce that we have added a new function called
>>> keepStandardChromosomes() to  GenomicRanges(1.15.18)  - devel branch.
>>> 
>>> This function allows a user to subset a given object (containing a
>>> seqinfo class) to
>>> retain only the primary chromosomes and the autosomes.
>>> 
>>> Please feel free to get back if you face any issues.
>>> 
>>> Thanks,
>>> Sonali Arora.
>>> 
>>> On 12/16/2013 11:01 AM, Michael Lawrence wrote:
>>> 
>>> Awesome, thanks, Sonali. And welcome to the team.
>>> 
>>> Michael
>>> 
>>> 
>>>> On Mon, Dec 16, 2013 at 10:38 AM, Arora, Sonali <sarora at fhcrc.org> wrote:
>>>> 
>>>> Hi Michael,
>>>> 
>>>> That is an extremely interesting question. We have a couple of ideas and
>>>> are beginning to work on it.
>>>> We hope to come up with something soon.
>>>> 
>>>> Thanks,
>>>> Sonali.
>>>> 
>>>> 
>>>>> On 12/16/2013 6:14 AM, Michael Lawrence wrote:
>>>>> 
>>>>>  should be stored with the Seqinfo. It could be imputed
>>>>> (along with the isCircular I think) via th
>>>> 
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 
>    [[alternative HTML version deleted]]
> 
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel



More information about the Bioc-devel mailing list