[R] idiom for constructing data frame

William Dunlap wdunlap at tibco.com
Fri Apr 3 16:46:34 CEST 2015


> but wouldn't it be more to the point to do
>
> df <- as.data.frame(rep(list(rep(NA_real_, 10)),3))
> names(df) <- names

As a matter of personal style (and functional programming
sensibility), I prefer not to make named objects and then modify them.
Also, the names coming out of that as.data.frame call are exceedingly
ugly and I'd rather not generate them at all.
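
A minimal sketch of that preference, for illustration only: the names
are attached to the list before it reaches as.data.frame(), so no
intermediate object is modified and no default names are generated.
setNames() and the name colNames are choices made for this sketch, not
something proposed earlier in the thread; the column names are the
ones from the quoted example.

```r
# Column names from the quoted example.
colNames <- c("foo", "bar", "baz")

# Name the list first, then convert it in a single expression.
# The list already carries names, so as.data.frame() uses them directly.
df <- as.data.frame(
  setNames(rep(list(rep(NA_real_, 10)), length(colNames)), colNames)
)

names(df)
dim(df)
```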

Also, adding the names after calling data.frame() can give
different results than passing them into data.frame(), which by
default (check.names=TRUE) mangles nonsyntactic names like
"Second Name" into "Second.Name". The mangling is often preferable,
but it is different.
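
The difference can be seen in a small sketch like the following. It
reuses dfNames from the earlier example; the object names cols,
mangled, kept and later are made up for the illustration.

```r
dfNames <- c("First", "Second Name")

# A named list of all-NA numeric columns, built as in the structure() example.
cols <- lapply(structure(dfNames, names = dfNames),
               function(name) rep(NA_real_, 5))

# Names passed through data.frame(): the default check.names = TRUE
# makes them syntactic, so the space becomes a dot.
mangled <- data.frame(cols)
names(mangled)

# Same call with check.names = FALSE keeps the names as given.
kept <- data.frame(cols, check.names = FALSE)
names(kept)

# Names assigned after construction are stored as given, with no checking.
later <- data.frame(matrix(NA_real_, nrow = 5, ncol = 2))
names(later) <- dfNames
names(later)
```

Which result is wanted depends on how the columns will be used later:
syntactic names work with df$Second.Name, while "Second Name" needs
backquotes or [[ ]].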



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Apr 3, 2015 at 5:51 AM, peter dalgaard <pdalgd at gmail.com> wrote:

>
> > On 31 Mar 2015, at 20:55 , William Dunlap <wdunlap at tibco.com> wrote:
> >
> > You can use structure() to attach the names to a list that is input to
> > data.frame.
> > E.g.,
> >
> > dfNames <- c("First", "Second Name")
> > data.frame(lapply(structure(dfNames, names=dfNames),
> > function(name)rep(NA_real_, 5)))
> >
>
> Yes, I cooked up something similar:
>
> names <- c("foo","bar","baz")
> names(names) <- names # confuse 'em....
> as.data.frame(lapply(names, function(x) rep(NA_real_,10)))
>
> but wouldn't it be more to the point to do
>
> df <- as.data.frame(rep(list(rep(NA_real_, 10)),3))
> names(df) <- names
>
> ?
>
> The lapply() approach could be generalized to a vector of column classes,
> though.
>
> A general solution looks impracticable; once you start considering how to
> specify factor columns with each their own level set, things get a bit out
> of hand.
>
> -pd
>
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> > On Tue, Mar 31, 2015 at 11:37 AM, Sarah Goslee <sarah.goslee at gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Duncan Murdoch suggested:
> >>
> >>> The matrix() function has a dimnames argument, so you could do this:
> >>>
> >>> names <- c("strat", "id", "pid")
> >>> data.frame(matrix(NA, nrow=10, ncol=3, dimnames=list(NULL, names)))
> >>
> >> That's a definite improvement, thanks. But no way to skip matrix()? It
> >> just seems unRlike, although since it's only full of NA values there
> >> are no coercion issues with column types or anything, so it doesn't
> >> hurt. It's just inelegant. :)
> >>
> >> Sarah
> >> --
> >> Sarah Goslee
> >> http://www.functionaldiversity.org
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>
