[R] why data.frame, mutate package and not lists

Duncan Murdoch murdoch.duncan at gmail.com
Wed Sep 14 20:54:25 CEST 2016


On 14/09/2016 2:40 PM, jeremiah rounds wrote:
> "If you want to add variable to data.frame you have to use attach, detach.
> Right?"
>
> Not quite.  Use it like a list to add a variable to a data.frame
>
> e.g.
> df = list()
> df$var1 = 1:10
> df = as.data.frame(df)
> df$var2 = 1:10
> df[["var3"]] = 1:10
> df
> df = as.list(df)
> df$var4 = 1:10
> as.data.frame(df)
>
> Ironically the primary reason to use a data.frame in my head is to signal
> that you are thinking of your data as a row-oriented tabular storage.
>   "Ironic" because in technical detail that is not a requirement to be a
> data.frame, but when I reflect on the typical way a seasoned R programmer
> approaches list and data.frames that is basically what they are
> communicating.

I believe it is intended to be a requirement.  You can construct things 
with class "data.frame" that don't have that structure, but lots of 
stuff will go wrong if you do.
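
A minimal sketch of that point, not taken from the thread (the object
names cols, res and bad are hypothetical, chosen only for illustration).
It assumes base R only: the usual constructor refuses columns of unequal
length, while setting the class by hand yields an object whose reported
row count disagrees with its contents.

```r
# A plain list may hold components of different lengths.
cols <- list(a = 1:3, b = 1:2)

# The regular constructor rejects them (3 is not a multiple of 2),
# so the error handler's value is returned instead of a data.frame.
res <- tryCatch(as.data.frame(cols), error = function(e) "error")

# Forcing the class by hand skips that check entirely.
bad <- structure(list(a = 1:3, b = 1:2),
                 class = "data.frame",
                 row.names = 1:3)

# The row count comes from the row.names attribute, not from the columns,
# so it no longer matches the length of every column.
n_rows   <- nrow(bad)
col_lens <- vapply(unclass(bad), length, integer(1))
```

Here n_rows and col_lens disagree for column b, which is the kind of
inconsistency that code written for ordinary data frames does not expect.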

Duncan Murdoch
>
> I was going to post that a reason to use data.frames is to take advantages
> of optimizations and syntax sugar for data.frames, but in reality if code
> does not assume a row-oriented data structure in a data.frame there is not
> much I can think of that exists in the way of optimization.  For example,
> we could point to "subset" and say that is a reason to use data.frames and
> not list, but that only works if you use data.frame in a conventional way.
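
A short sketch of the subset() point above, assuming base R only (the
names df, lst, keep, high and high_lst are hypothetical and the data are
invented for illustration). With a data.frame the row filter is one call;
with a plain list the same filter has to be applied to each component by
hand.

```r
# Tabular data held as a data.frame: subset() filters whole rows.
df <- data.frame(id = 1:5, score = c(10, 25, 5, 40, 15))
high <- subset(df, score > 12)

# The same data held as a plain list: build the logical index once,
# then apply it to every component so the "rows" stay aligned.
lst <- list(id = 1:5, score = c(10, 25, 5, 40, 15))
keep <- lst$score > 12
high_lst <- lapply(lst, function(col) col[keep])
```

Both versions keep the same ids and scores; the list version only works
because every component happens to have the same length, which is the
row-oriented convention a data.frame makes explicit.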
>
> In the end, my advice to you is if it is a table make it a data.frame and
> if it is not easily thought of as a table or row-oriented data structure
> keep it as a list.
>
> Thanks,
> Jeremiah
>
>
>
>
>
> On Wed, Sep 14, 2016 at 11:15 AM, Alaios via R-help <r-help at r-project.org>
> wrote:
>
> > thanks for all the answers. I think also ggplot2 requires data.frames. If
> > you want to add variable to data.frame you have to use attach, detach.
> > Right? Any more links that discuss those two different approaches?
> > Alex
> >
> >     On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <
> > bgunter.4567 at gmail.com> wrote:
> >
> >
> >  This is partially a matter of subjective opinion, and so pointless; but
> > I would point out that data frames are the canonical structure for a
> > great many of R's modeling and graphics functions, e.g. lm, xyplot,
> > etc.
> >
> > As for mutate() etc., that's about UI's and user friendliness, and
> > imho my ho is meaningless.
> >
> > Best,
> > Bert
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help <r-help at r-project.org>
> > wrote:
> > > Hi all, I have seen data.frames and operations from the mutate package
> > > getting really popular. In recent years I have been using lists
> > > extensively. Is there any reason not to use lists and to use other data
> > > types for data manipulation and storage?
> > > Is there any article that describes their differences? Thank you for
> > > your reply.
> > > Regards
> > > Alex
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.