[Rd] Quiz: How to get a "named column" from a data frame

J. R. M. Hosking JRMHosking at gmail.com
Sat Aug 18 21:28:30 CEST 2012


On 2012-08-18 11:03, Martin Maechler wrote:
> Today, I was looking for an elegant (and efficient) way
> to get a named (atomic) vector by selecting one column of a data frame.
> Of course, the vector names must be the rownames of the data frame.
>
> Ok, here is the quiz, I know one quite "cute"/"slick" answer, but was
> wondering if there are obvious better ones, and
> also if this should not become more idiomatic (hence "R-devel"):
>
> Consider this toy example, where the dataframe already has only
> one column :
>
>> nv<- c(a=1, d=17, e=101); nv
>    a   d   e
>    1  17 101
>
>> df<- as.data.frame(cbind(VAR = nv)); df
>    VAR
> a   1
> d  17
> e 101
>
> Now, how can I get 'nv' back from 'df'?   I.e., how to get
>
>> identical(nv, .......)
> [1] TRUE
>
> where ...... only uses 'df' (and no non-standard R packages)?
>
> As said, I know a simple solution (*), but I'm sure it is not
> obvious to most R users and probably not even to the majority of
> R-devel readers... OTOH, people like Bill Dunlap will not take
> long to provide it or a better one.
>
> (*) In my solution, the above '.......' consists of 17 letters.
> I'll post it later today (CEST time) ... or confirm
> that someone else has done so.
>
> Martin
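
For reference, one straightforward base-R way to pass the quoted test is sketched below. It is not claimed to be Martin's 17-letter solution, which this message does not give; it only assumes the single-column 'df' from the quoted example.

```r
# Rebuild the toy example from the quoted message.
nv <- c(a = 1, d = 17, e = 101)
df <- as.data.frame(cbind(VAR = nv))

# Take the single column as a plain vector, then attach the row names.
nv2 <- setNames(df[[1]], rownames(df))
print(identical(nv, nv2))
```

The same pattern applies to any column of a wider data frame: replace 'df[[1]]' with the column of interest.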

For this purpose my private function library has a function withnames():

withnames(): Extract from data frame as a named vector

Description: Extracts data from a data frame; if the result is a vector
(i.e. we extracted a single column and did not specify 'drop=FALSE')
it is assigned names derived from the row names of the data frame.

Usage: withnames(expr)

Arguments: expr: An R expression, typically one that extracts from a data frame.

Details: 'expr' is evaluated in an environment in which the extractor
functions '$.data.frame', '[.data.frame', and '[[.data.frame' are
replaced by versions that attach the data frame's row names to an
extracted vector.

Value: The value of 'expr', evaluated as described above.

## Code

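withnames <- function(expr) {
   # Evaluate 'expr' with the three extractor methods below placed ahead of
   # the standard ones, so that S3 dispatch picks them up first.
   eval(substitute(expr),
   list(
     # x[i, ...]: name the result by the selected rows
     `[.data.frame` = function(x, i, ...) {
       out <- x[i, ...]
       if (is.null(dim(out))) names(out) <- row.names(x)[i]
       return(out)},
     # x[[...]]: a whole column, so use all row names
     `[[.data.frame` = function(x, ...) {
       out <- x[[...]]
       if (is.null(dim(out))) names(out) <- row.names(x)
       return(out)},
     # x$name: allow partial matching of the column name, as '$' does
     `$.data.frame` = function(x, name) {
       out <- x[[name, exact = FALSE]]
       if (is.null(dim(out))) names(out) <- row.names(x)
       return(out)}
     ),
   # Everything else is looked up in the caller's environment
   enclos = parent.frame())
}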

## Examples

# stringsAsFactors=TRUE keeps 'bb' a factor in R >= 4.0 as well
dd <- data.frame(aa=1:6, bb=letters[c(1,3,2,3,3,1)],
   row.names=LETTERS[1:6], stringsAsFactors=TRUE)
dd
dd$aa                          # Unnamed vector
withnames(dd$aa)               # Named vector
withnames(dd[["aa"]])          # Named vector
withnames(dd[2:4,"aa"])        # Named vector
withnames(dd$bb)               # Factor with names
withnames(outer(dd$a,dd$a))    # Both dimensions have names ('a' partially matches 'aa')

## But now I am looking for a version that will play nicely with with():

withnames(with(dd, aa))  # No names!
with(dd, withnames(aa))  # No names!
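
The two calls above return no names because with() evaluates 'aa' directly as a variable taken from the data frame, so none of the replaced extractor methods is ever called. One possible direction is sketched below. The helper with_names() is hypothetical (it is not part of the function library described above), and it assumes that every column is an atomic vector or a factor.

```r
# Hypothetical helper (not from the original post): evaluate 'expr' with the
# columns of 'data' visible as variables, each already carrying the row names.
with_names <- function(data, expr) {
  named_cols <- lapply(data, function(col) {
    names(col) <- row.names(data)
    col
  })
  # Variables not found among the columns are looked up in the caller.
  eval(substitute(expr), named_cols, enclos = parent.frame())
}

dd <- data.frame(aa = 1:6, bb = letters[c(1, 3, 2, 3, 3, 1)],
                 row.names = LETTERS[1:6])
res <- with_names(dd, aa)
print(res)
print(with_names(dd, aa * 2))
```

Naming every column up front costs a copy of the data, which may matter for large data frames; this sketch trades that cost for simplicity.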


