[Bioc-devel] Interoperability between DataFrame and dplyr?
lawrence.michael at gene.com
Fri Apr 24 17:07:14 CEST 2015
On Fri, Apr 24, 2015 at 7:42 AM, Jim Hester <james.f.hester at gmail.com> wrote:
> dplyr internally converts all `data.frame` objects to its `tbl_df` class
> and most dplyr methods operate on the `tbl` superclass, see (
I hope you're speaking only of the data frame implementation here.
> The most direct route to getting DataFrame objects working would be to
> just provide a method that converts the `DataFrame` objects to
> `data.frame`, then call `tbl_df()` on that.
That coercion already exists, of course, and it's via the S3 as.data.frame,
so it should work already.
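
For illustration, the coercion route described above might look like the sketch below. It assumes both S4Vectors and dplyr are installed; the column names and values are made up for the example. The wrapper was `tbl_df()` in the dplyr of this thread; newer dplyr releases expose the same conversion as `as_tibble()`, which is what the sketch calls. Functions are namespace-qualified because the two packages mask some of each other's names.

```r
library(S4Vectors)
library(dplyr)

# A small DataFrame with plain atomic columns (illustrative data only).
DF <- S4Vectors::DataFrame(id = 1:3, score = c(0.5, 1.5, 2.5))

# Step 1: the existing S3 as.data.frame coercion for DataFrame.
df <- as.data.frame(DF)

# Step 2: wrap the data.frame for dplyr (tbl_df() in 2015-era dplyr).
tb <- dplyr::as_tibble(df)

# Ordinary dplyr verbs then apply to the converted object.
res <- dplyr::filter(tb, score > 1)
```

This only covers columns that survive the trip to `data.frame`; columns holding more complex Vector-derived objects are outside the scope of this sketch.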
> However this would copy the data multiple times, so probably the best
> option would be to create a new `tbl_DF` class to handle `DataFrame`
> objects directly.
It doesn't copy the data, only the list of pointers to the columns (so it's
pretty much instantaneous), but yeah, I agree a new implementation is the way
to go.
> You can look in the various tbl-*.r files at (
> https://github.com/hadley/dplyr/blob/master/R/) to see what methods
> should be implemented.
> On Fri, Apr 24, 2015 at 10:16 AM, Michael Lawrence <
> lawrence.michael at gene.com> wrote:
>> Sure, but the way DataFrame is flexible is by relying on two abstractions
>> in base R: just length() and '['. If dplyr does the same thing, which is
>> totally reasonable, everything should work the same.
>> On Thu, Apr 23, 2015 at 4:32 PM, Vincent Carey <
>> stvjc at channing.harvard.edu> wrote:
>> > Seems to me that DataFrame is too flexible -- you can have very complex
>> > objects in the columns (anything that inherits from Vector) with which,
>> > in its current state, dplyr would not work too naturally. You would wind
>> > up doing a fair amount of coercion of such entities, so it seems to me
>> > that arranging a coercion of DataFrames satisfying specific conditions to
>> > data.frame would be a path of low resistance.
>> > Ready to be corrected of course.
>> > On Thu, Apr 23, 2015 at 7:06 PM, Ryan C. Thompson <rct at thompsonclan.org>
>> > wrote:
>> > > Hi all,
>> > >
>> > > So, dplyr is a pretty cool thing, but it currently works with data.frame
>> > > and data.table, but not S4Vectors::DataFrame. I'd like to change that if
>> > > possible, and I assume that this would "simply" involve writing some
>> > > code. However, I'm not really sure where to start, and I expect things
>> > > might be complicated because dplyr uses S3 and S4Vectors uses S4. Can
>> > > anyone offer any pointers?
>> > >
>> > > -Ryan
>> > >
>> > > _______________________________________________
>> > > Bioc-devel at r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> > >