[Bioc-devel] as.list of a GRanges

Hervé Pagès hpages at fredhutch.org
Tue Feb 20 07:18:32 CET 2018


Hi Renan,

Most packages affected by these changes are packages that loop over
the individual ranges of a GRanges object. They generally don't
call as.list() directly but use something like lapply(), vapply(),
sapply(), Map(), Reduce(), etc. All these functions indeed call
as.list() internally on the supplied object before looping over it.
Just to clarify, when I say I found a dozen Bioconductor packages
in the entire software repo where as.list() was used on a GRanges
object, I'm counting all the packages that use it explicitly or
implicitly. This includes signeR, which I had on my list of packages
to fix.
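
To illustrate the implicit call, here is a minimal base R sketch. The
toy_ranges class, its as.list() method and the counter are made up for
the example; they only stand in for a non-vector container and are not
how GRanges is implemented:

```r
# Hypothetical toy class: a classed list holding start/end vectors.
toy <- structure(list(start = c(1, 5, 20), end = c(10, 15, 30)),
                 class = "toy_ranges")

# Counter kept in an environment so the method can update it.
counter <- new.env()
counter$n <- 0L

# as.list() method returning one element per range. It is registered
# explicitly so that dispatch works wherever this snippet is evaluated.
registerS3method("as.list", "toy_ranges", function(x, ...) {
  counter$n <- counter$n + 1L
  lapply(seq_along(x$start),
         function(i) c(start = x$start[i], end = x$end[i]))
})

# Neither call below mentions as.list(), but both dispatch to the
# method above because 'toy' is a classed object.
widths <- lapply(toy, function(r) r[["end"]] - r[["start"]] + 1)
span <- Reduce(function(a, b) c(start = min(a[["start"]], b[["start"]]),
                                end = max(a[["end"]], b[["end"]])),
               toy)

counter$n  # number of implicit as.list() calls so far
```

Any change to what as.list() returns for such a class therefore changes
what the looping functions iterate over, even in code that never calls
as.list() itself.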

BTW in this particular instance, I would recommend doing

     reduce(granges, drop.empty.ranges=TRUE)

instead of

     Reduce(union, as(granges, "GRangesList"))

reduce() walks over the individual ranges of the supplied object at
the C level, so it is much faster than performing a binary union in
an R loop. It should also be more memory efficient.
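
A quick sketch of the two approaches on a toy object, assuming the
GenomicRanges package is installed (the object gr and its ranges are
made up for the example):

```r
library(GenomicRanges)

# Made-up example: three ranges on chr1, the first two overlapping.
gr <- GRanges("chr1", IRanges(start = c(1, 5, 20), end = c(10, 15, 30)))

# Recommended: merge overlapping ranges in a single call.
merged <- reduce(gr, drop.empty.ranges = TRUE)

# Alternative: pairwise union() over one-range GRanges objects,
# driven by an R-level loop inside Reduce().
pairwise <- Reduce(union, as(gr, "GRangesList"))

# On this toy input both approaches give the same merged ranges.
stopifnot(identical(start(merged), start(pairwise)),
          identical(end(merged), end(pairwise)))
```

The speed difference only becomes noticeable on objects with many
ranges; on three ranges like here both calls return immediately.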

Cheers,
H.


On 02/16/2018 09:02 AM, Renan Valieris wrote:
> FWIW, this change also affects code that doesn't call as.list() explicitly.
> 
> For example, Reduce(union, granges): Reduce is implemented in base, and
> will call as.list() on its argument if it isn't already a vector.
> 
> I understand it wasn't intended to be used this way, but with this in mind
> there are more packages potentially affected by the change.
> 
> On Fri, Feb 16, 2018 at 1:25 PM, Nathan Sheffield <nathan at code.databio.org>
> wrote:
> 
>> For what it's worth, my package (LOLA) was one that used as.list on a
>> GRanges or GRangesList, and those calls were broken by changes to devel.
>> Since I was also pushing changes at the time, I assumed the devel build
>> errors were due to my updates. I spent quite a bit of time trying to
>> figure out what was wrong before I realized this breakage was not caused by
>> my updates, but by upstream changes in GRanges. Eventually I tracked down
>> the errors to as.list (and ultimately found other errors, which we discussed
>> earlier on this list). My conclusion from this was that, from my
>> perspective, using the deployed bioc devel as a way to test for what
>> refactoring will break doesn't seem like the ideal way to go. I assumed
>> that, generally, other package changes wouldn't be pushed that
>> would break my package's build, so it devalued the role of the dev builds
>> and reduced my confidence in using them (now when I see an error I may assume
>> it's something else, and wait a few days, instead of diving right in to try
>> to solve the problem).
>>
>> I like the idea of temporarily restoring as.list with a deprecation
>> message -- also, as a general development philosophy going forward in terms
>> of testing on devel. This would have saved me a lot of time troubleshooting
>> in this instance.
>>
>> Just my 2 cents.
>>
>> -Nathan
>>
>>
>>
>> On 02/16/2018 02:57 AM, Bernat Gel wrote:
>>
>>> Hi Hervé and others,
>>>
>>> Thanks for the responses.
>>>
>>> I wouldn't call as.list() of a GRanges an "obscure behaviour" but more a
>>> "works as expected, even if not clearly documented" behaviour.
>>>
>>> In any case I can change the code to as(gr, "GRangesList") as suggested.
>>>
>>> Thanks again for the responses and discussion :)
>>>
>>> Bernat
>>>
>>>
>>> *Bernat Gel Moreno*
>>> Bioinformatician
>>>
>>> Hereditary Cancer Program
>>> Program of Predictive and Personalized Medicine of Cancer (PMPPC)
>>> Germans Trias i Pujol Research Institute (IGTP)
>>>
>>> Campus Can Ruti
>>> Carretera de Can Ruti, Camí de les Escoles s/n
>>> 08916 Badalona, Barcelona, Spain
>>>
>>> Tel: (+34) 93 554 3068
>>> Fax: (+34) 93 497 8654
>>> bgel at igtp.cat
>>> www.germanstrias.org
>>>
>>> On 02/15/2018 at 11:19 PM, Hervé Pagès wrote:
>>>
>>>> On 02/15/2018 01:57 PM, Michael Lawrence wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Feb 15, 2018 at 1:45 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>>>>>
>>>>>      On 02/15/2018 11:53 AM, Cook, Malcolm wrote:
>>>>>
>>>>>          Hi,
>>>>>
>>>>>          Can I ask, is this change under discussion in current release or
>>>>>          so far in Bioconductor devel only (my assumption)?
>>>>>
>>>>>
>>>>>      Bioconductor devel only.
>>>>>
>>>>>
>>>>>             > On 02/15/2018 08:37 AM, Michael Lawrence wrote:
>>>>>             > > So is as.list() no longer supported for GRanges objects?
>>>>>             > > I have found it useful in places.
>>>>>             >
>>>>>             > Very few places. I found a dozen of them in the entire
>>>>>             > software repo.
>>>>>
>>>>>          However there are probably more in the wild...
>>>>>
>>>>>
>>>>>      What as.list() was doing on a GRanges object was not documented.
>>>>>      Relying on some kind of obscure undocumented feature is never a
>>>>>      good idea.
>>>>>
>>>>>
>>>>> There's just too much that is documented implicitly through inherited
>>>>> behaviors, or where we say things like "this data structure behaves as one
>>>>> would expect given base R". It's not fair to claim that those features are
>>>>> undocumented. Our documentation is not complete enough to use it as an
>>>>> excuse.
>>>>>
>>>>
>>>> It's not fair to suggest that this is a widely used feature either.
>>>>
>>>> I've identified all the places in the 1500 software packages where
>>>> this was used, and, as I said, there were very few places. BTW I
>>>> fixed most of them but my plan is to fix all of them. Some of the
>>>> code that is outside the Bioc package corpus might be affected but
>>>> it's fair to assume that this will be a very rare occurrence. This can
>>>> be mitigated by temporarily restoring as.list() on GRanges, with a
>>>> deprecation message, and waiting one more devel cycle before replacing
>>>> it with the new behavior. I chose to disable it for now, on purpose, so
>>>> I can identify packages that break (the build report is a great tool
>>>> for that) and fix them.
>>>>
>>>> I'm not using the fact that as.list() on a GRanges is not documented
>>>> as an excuse for anything. Only to help those with concerns put
>>>> things in perspective and relax.
>>>>
>>>> H.
>>>>
>>>>
>>>>>
>>>>>             > Now you should use as.list(as(gr, "GRangesList")) instead.
>>>>>             > as.list() was behaving inconsistently on IRanges and
>>>>>             > GRanges objects, which is blocking new developments. It
>>>>>             > will come back with a consistent behavior. More generally
>>>>>             > speaking IRanges and GRanges will behave consistently as
>>>>>             > far as their "list interpretation" is concerned.
>>>>>
>>>>>          Can we please be assured to be reminded of this prominently in
>>>>>          release notes?
>>>>>
>>>>>
>>>>>      The changes will be announced and described on this list and in the
>>>>>      NEWS files of the IRanges and GenomicRanges packages.
>>>>>
>>>>>      H.
>>>>>
>>>>>
>>>>>          Thanks!
>>>>>
>>>>>          ~malcolm
>>>>>
>>>>>
>>>>>      --     Hervé Pagès
>>>>>
>>>>>      Program in Computational Biology
>>>>>      Division of Public Health Sciences
>>>>>      Fred Hutchinson Cancer Research Center
>>>>>      1100 Fairview Ave. N, M1-B514
>>>>>      P.O. Box 19024
>>>>>      Seattle, WA 98109-1024
>>>>>
>>>>>      E-mail: hpages at fredhutch.org
>>>>>      Phone: (206) 667-5791
>>>>>      Fax: (206) 667-1319
>>>>>
>>>>>
>>>>>
>>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


