[Rd] Quiz: How to get a "named column" from a data frame

Sat Aug 18 23:56:13 CEST 2012

That would have been essentially my suggestion as well.  I prefer it for its
clarity (and speed).  I didn't know whether you wanted your solution to apply
to matrices embedded in data.frames as well.  In S+, rownames<-() works on
vectors (because it calls the generic rowIds<-()), so the following works:
  > f4 <- function(df, column) { tmp <- df[[column]] ; rownames(tmp) <- rownames(df) ; tmp}
  > nv <- c(a=1,d=17,e=101)
  > df <- data.frame(VAR=nv, Two=3^(1:3))
  > f4(df, 2)
   a d  e 
   3 9 27
  > df$Matrix <- matrix(1001:1006, ncol=2, nrow=3)
  > f4(df, "Matrix")
    [,1] [,2] 
  a 1001 1004
  d 1002 1005
  e 1003 1006
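
For comparison, here is a rough sketch of the same idea in R. In R,
rownames<-() raises an error on a plain vector (an object with no dim
attribute), so this version falls back to names<-() in that case. The name
f4_r is made up for this illustration; it is not an existing R function.

```r
# Hypothetical R counterpart of the S+ f4() above (f4_r is an invented name).
f4_r <- function(df, column) {
  tmp <- df[[column]]
  if (is.null(dim(tmp))) {
    # Plain vector: rownames<-() would fail in R, so set names instead.
    names(tmp) <- rownames(df)
  } else {
    # Matrix (or other object with dims) stored as a column.
    rownames(tmp) <- rownames(df)
  }
  tmp
}

nv <- c(a=1, d=17, e=101)
df <- data.frame(VAR=nv, Two=3^(1:3))
df$Matrix <- matrix(1001:1006, ncol=2, nrow=3)

f4_r(df, "VAR")     # named numeric vector
f4_r(df, "Matrix")  # matrix with row names taken from df
```

The branch on dim() is the only difference from the S+ version; everything
else follows the one-liner above.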

I forget whether R has something like rowIds() (it is to names() and
rownames() as NROW() is to length() and nrow()).
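
To make the analogy concrete, here is a minimal sketch of what such a pair of
helpers might look like in R. row_ids() and row_ids<-() are hypothetical names
invented here; they are not part of base R or S+, and the sketch only
distinguishes objects with and without a dim attribute.

```r
# Hypothetical helpers: names() for dimensionless objects, rownames() otherwise,
# in the same way that NROW() covers both length() and nrow().
row_ids <- function(x) {
  if (is.null(dim(x))) names(x) else rownames(x)
}

`row_ids<-` <- function(x, value) {
  if (is.null(dim(x))) {
    names(x) <- value
  } else {
    rownames(x) <- value
  }
  x
}

v <- c(10, 20, 30)
row_ids(v) <- c("a", "d", "e")   # sets names(v)
row_ids(v)

m <- matrix(1:6, nrow=3)
row_ids(m) <- c("a", "d", "e")   # sets rownames(m)
row_ids(m)
```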

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf
> Of Winston Chang
> Sent: Saturday, August 18, 2012 11:54 AM
> To: Martin Maechler
> Cc: R. Devel List
> Subject: Re: [Rd] Quiz: How to get a "named column" from a data frame
> 
> This isn't super-concise, but has the virtue of being clear:
> 
> nv <- c(a=1, d=17, e=101)
> df <- as.data.frame(cbind(VAR = nv))
> 
> identical(nv, setNames(df$VAR, rownames(df)))
> # TRUE
> 
> 
> It seems to be more efficient than the other methods as well:
> 
> f1 <- function() setNames(df$VAR, rownames(df))
> f2 <- function() t(df)[1,]
> f3 <- function() as.matrix(df)[,1]
> 
> library(microbenchmark)  # CRAN package that provides microbenchmark()
> r <- microbenchmark(f1(), f2(), f3(), times=1000)
> r
> # Unit: microseconds
> #   expr    min      lq median      uq      max
> # 1 f1() 14.589 17.0315 18.608 19.3220   89.388
> # 2 f2() 68.057 70.8735 72.240 75.8065 3707.012
> # 3 f3() 58.153 61.2600 62.521 65.0380  238.483
> 
> -Winston
> 
> 
> 
> On Sat, Aug 18, 2012 at 10:03 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
> > Today, I was looking for an elegant (and efficient) way
> > to get a named (atomic) vector by selecting one column of a data frame.
> > Of course, the vector names must be the rownames of the data frame.
> >
> > Ok, here is the quiz, I know one quite "cute"/"slick" answer, but was
> > wondering if there are obvious better ones, and
> > also if this should not become more idiomatic (hence "R-devel"):
> >
> > Consider this toy example, where the data frame already has only
> > one column:
> >
> >> nv <- c(a=1, d=17, e=101); nv
> >   a   d   e
> >   1  17 101
> >
> >> df <- as.data.frame(cbind(VAR = nv)); df
> >   VAR
> > a   1
> > d  17
> > e 101
> >
> > Now, how can I get 'nv' back from 'df'?  I.e., how to get
> >
> >> identical(nv, .......)
> > [1] TRUE
> >
> > where ...... only uses 'df' (and no non-standard R packages)?
> >
> > As said, I know a simple solution (*), but I'm sure it is not
> > obvious to most R users and probably not even to the majority of
> > R-devel readers... OTOH, people like Bill Dunlap will not take
> > long to provide it or a better one.
> >
> > (*) In my solution, the above '.......' consists of 17 letters.
> > I'll post it later today (CEST time) ... or confirm
> > that someone else has done so.
> >
> > Martin
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> 