[Rd] Quiz: How to get a "named column" from a data frame
William Dunlap
wdunlap at tibco.com
Sat Aug 18 23:56:13 CEST 2012
That would have been essentially my suggestion as well. I prefer its clarity
(and speed). I didn't know if you wanted your solution to also apply
to matrices embedded in data.frames. In S+ rownames<-() works on vectors
(because it calls the generic rowId<-()) so the following works:
> f4 <- function(df, column) { tmp <- df[[column]] ; rownames(tmp) <- rownames(df) ; tmp}
> nv <- c(a=1,d=17,e=101)
> df <- data.frame(VAR=nv, Two=3^(1:3))
> f4(df, 2)
 a  d  e
 3  9 27
> df$Matrix <- matrix(1001:1006, ncol=2, nrow=3)
> f4(df, "Matrix")
  [,1] [,2]
a 1001 1004
d 1002 1005
e 1003 1006
I forget if R has something like rowIds() (it is to names and rownames as
NROW is to length and nrow).
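
For comparison, here is a minimal sketch of how f4 might be written for R, on the assumption that R's rownames<-() stops with an error when its argument has no dimensions (unlike the S+ version above). The name f4_r is only illustrative; the function picks names<-() or rownames<-() depending on whether the column has a dim attribute:

```r
# Hypothetical R counterpart of f4(): label a column with the data frame's row names.
# Assumes rownames<-() refuses objects without dimensions, so plain vectors
# go through names<-() and matrix columns through rownames<-().
f4_r <- function(df, column) {
  tmp <- df[[column]]
  if (is.null(dim(tmp))) {
    names(tmp) <- rownames(df)      # atomic vector column
  } else {
    rownames(tmp) <- rownames(df)   # matrix embedded in the data frame
  }
  tmp
}

nv <- c(a=1, d=17, e=101)
df <- data.frame(VAR=nv, Two=3^(1:3))
df$Matrix <- matrix(1001:1006, ncol=2, nrow=3)

f4_r(df, 2)         # named vector, names taken from rownames(df)
f4_r(df, "Matrix")  # matrix with rownames taken from rownames(df)
```

The dim() test plays the role that the generic rowId<-() plays in S+, so the same call covers both kinds of column.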
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf
> Of Winston Chang
> Sent: Saturday, August 18, 2012 11:54 AM
> To: Martin Maechler
> Cc: R. Devel List
> Subject: Re: [Rd] Quiz: How to get a "named column" from a data frame
>
> This isn't super-concise, but has the virtue of being clear:
>
> nv <- c(a=1, d=17, e=101)
> df <- as.data.frame(cbind(VAR = nv))
>
> identical(nv, setNames(df$VAR, rownames(df)))
> # TRUE
>
>
> It seems to be more efficient than the other methods as well:
>
> f1 <- function() setNames(df$VAR, rownames(df))
> f2 <- function() t(df)[1,]
> f3 <- function() as.matrix(df)[,1]
>
> library(microbenchmark)  # needed for the timing only
> r <- microbenchmark(f1(), f2(), f3(), times=1000)
> r
> # Unit: microseconds
> #   expr    min      lq median      uq      max
> # 1 f1() 14.589 17.0315 18.608 19.3220   89.388
> # 2 f2() 68.057 70.8735 72.240 75.8065 3707.012
> # 3 f3() 58.153 61.2600 62.521 65.0380  238.483
>
> -Winston
>
>
>
> On Sat, Aug 18, 2012 at 10:03 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
> > Today, I was looking for an elegant (and efficient) way
> > to get a named (atomic) vector by selecting one column of a data frame.
> > Of course, the vector names must be the rownames of the data frame.
> >
> > Ok, here is the quiz, I know one quite "cute"/"slick" answer, but was
> > wondering if there are obvious better ones, and
> > also if this should not become more idiomatic (hence "R-devel"):
> >
> > Consider this toy example, where the dataframe already has only
> > one column :
> >
> >> nv <- c(a=1, d=17, e=101); nv
> >   a   d   e
> >   1  17 101
> >
> >> df <- as.data.frame(cbind(VAR = nv)); df
> >   VAR
> > a   1
> > d  17
> > e 101
> >
> > Now, how can I get 'nv' back from 'df'? I.e., how to get
> >
> >> identical(nv, .......)
> > [1] TRUE
> >
> > where ...... only uses 'df' (and no non-standard R packages)?
> >
> > As said, I know a simple solution (*), but I'm sure it is not
> > obvious to most R users and probably not even to the majority of
> > R-devel readers... OTOH, people like Bill Dunlap will not take
> > long to provide it or a better one.
> >
> > (*) In my solution, the above '.......' consists of 17 letters.
> > I'll post it later today (CEST time) ... or confirm
> > that someone else has done so.
> >
> > Martin
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel