[R] How can I improve an ugly, dumb hack

Thu Sep 6 19:20:09 CEST 2012

On Sep 6, 2012, at 9:03 AM, Bert Gunter wrote:

> Hi Folks:
> 
> Here's the situation:
> 
>> m <- cbind(x=letters[1:3], y = letters[4:6])
>> m
>     x   y
> [1,] "a" "d"
> [2,] "b" "e"
> [3,] "c" "f"
> 
> ## m is a 2 column character matrix
> 
>> d <- data.frame(a=1:3,b=4:6)
>> d$c <- m
>> d
>  a b c.x c.y
> 1 1 4   a   d
> 2 2 5   b   e
> 3 3 6   c   f
> 
> ## But please note (as was remarked in a thread here a couple of months ago)
>> ncol(d)
> [1] 3
> 
> ## d is a ** 3 ** column data frame

I guess this means you are not the one performing the d$c <- m step? If you were in control of that step, you could get different (and more to your liking) behavior with 'cbind.data.frame':

> cbind(d, m)
  a b x y
1 1 4 a d
2 2 5 b e
3 3 6 c f
> ncol( cbind(d, m) )
[1] 4
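
One caveat on requirement 1 (character going to factor): whether x and y come out as factors here depends on the R version, since the default of stringsAsFactors changed from TRUE to FALSE in R 4.0.0. A minimal sketch, assuming a recent R, that asks for factors explicitly (cbind.data.frame hands its arguments on to data.frame with check.names = FALSE; the name d4 is only for illustration):

```r
# Rebuild the objects from the original post
m <- cbind(x = letters[1:3], y = letters[4:6])
d <- data.frame(a = 1:3, b = 4:6)

# stringsAsFactors is passed through to data.frame(), so the
# character matrix columns become factors regardless of the default
d4 <- cbind(d, m, stringsAsFactors = TRUE)

names(d4)          # column names taken from d and from colnames(m)
ncol(d4)           # four real columns, no embedded matrix
sapply(d4, class)  # check the resulting column classes
```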

> 
> Now what I wish to do is programmatically convert d to a 4 column
> frame with names c("a","b","x","y"). Of course:
> 
> 1. The column classes/modes must be preserved (character going to
> factor and numeric remaining numeric).
> 
> 2. I assume that I do not know a priori which of d's
> components/columns are matrices and which are vectors.
> 
> 3. There may be many more columns which are vectors or matrices than
> just the three in this little example.
> 
> I can easily and sensibly accomplish these 3 tasks, but the problem is
> that I run afoul of data frame column naming procedures in doing so,
> about which the data.frame Help page says rather enigmatically:
> 
> "How the names of the data frame are created is complex, and the rest
> of this paragraph is only the basic story." Indeed!
> (This, of course, is shorthand for "Go look at the source if you want
> to know!" )
> 
> Anyway, AFAICT from the Help, any "simple" approach to conversion
> using data.frame results in "c.x" and "c.y" for the names of the last
> two columns. I **can** get what I want by explicitly constructing the
> vector of names via the following ugly hack; my question is, can it be
> improved?
> 
>> dd <- do.call(data.frame,d)
> 
>> dd
>  a b c.x c.y
> 1 1 4   a   d
> 2 2 5   b   e
> 3 3 6   c   f
> 
>> ncol(dd)
> [1] 4
> 
>> cnames <- sapply(d,colnames)
>> cnames
> $a
> NULL
> 
> $b
> NULL
> 
> $c
> [1] "x" "y"
> 
> 
>> names(dd) <-  unlist(ifelse(sapply(cnames,is.null),names(d),cnames))
>  ##Yuck!
> 
>> dd
>  a b x y
> 1 1 4 a d
> 2 2 5 b e
> 3 3 6 c f
> 
> Cheers to all,
> Bert
> 
> 
> -- 
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
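
If you are stuck with d as given, one possible way to avoid patching the names afterwards is to convert each component to its own small data frame and then cbind the pieces, so the "c." prefix never gets created. This is only a sketch: flatten_df is a made-up name, it assumes every component of d is either an atomic vector (or factor) or a matrix, and it numbers the columns of a matrix that has no colnames as name.1, name.2, ...

```r
# Hypothetical helper (not part of base R): expand matrix components
# of a data frame into ordinary columns, keeping the matrix colnames.
flatten_df <- function(d, stringsAsFactors = TRUE) {
  pieces <- Map(function(col, nm) {
    if (is.matrix(col)) {
      out <- as.data.frame(col, stringsAsFactors = stringsAsFactors)
      if (is.null(colnames(col))) {
        # No colnames to use: fall back to name.1, name.2, ...
        names(out) <- paste(nm, seq_len(ncol(col)), sep = ".")
      }
      out
    } else {
      # Plain vector or factor: one column carrying its original name
      setNames(data.frame(col, stringsAsFactors = stringsAsFactors), nm)
    }
  }, d, names(d))
  # unname() so that cbind.data.frame does not prefix the component name
  do.call(cbind, unname(pieces))
}

# The example from the original post
m <- cbind(x = letters[1:3], y = letters[4:6])
d <- data.frame(a = 1:3, b = 4:6)
d$c <- m

dd <- flatten_df(d)
names(dd)
ncol(dd)
```

Because cbind.data.frame uses check.names = FALSE, duplicated names (say, a matrix column that is also called "a") are kept as they are, so you may want to check for those yourself.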

David Winsemius, MD
Alameda, CA, USA