[R] How can I improve an ugly, dumb hack
David Winsemius
dwinsemius at comcast.net
Thu Sep 6 19:20:09 CEST 2012
On Sep 6, 2012, at 9:03 AM, Bert Gunter wrote:
> Hi Folks:
>
> Here's the situation:
>
>> m <- cbind(x=letters[1:3], y = letters[4:6])
>> m
>      x   y
> [1,] "a" "d"
> [2,] "b" "e"
> [3,] "c" "f"
>
> ## m is a 2 column character matrix
>
>> d <- data.frame(a=1:3,b=4:6)
>> d$c <- m
>> d
>   a b c.x c.y
> 1 1 4   a   d
> 2 2 5   b   e
> 3 3 6   c   f
>
> ## But please note (as was remarked in a thread here a couple of months ago)
>> ncol(d)
> [1] 3
>
> ## d is a ** 3 ** column data frame

I guess this means you are not the one performing the d$c <- m step? If you were in control of that step, you could get different behavior (more to your liking) with 'cbind.data.frame':
> cbind(d, m)
  a b x y
1 1 4 a d
2 2 5 b e
3 3 6 c f
> ncol( cbind(d, m) )
[1] 4
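
If the matrix is already stored as a column of the data frame, the same cbind idea can be applied piece by piece. What follows is only a minimal sketch: flatten_df is a made-up helper name (not part of base R), and it assumes every matrix column carries colnames. stringsAsFactors = TRUE is set explicitly because the default became FALSE in R 4.0.0.

```r
# Hypothetical helper (not from the original thread): expand the matrix
# columns of a data frame into ordinary columns, keeping the matrix's
# own colnames instead of the "c.x"-style prefixed names.
flatten_df <- function(d) {
  pieces <- lapply(names(d), function(nm) {
    col <- d[[nm]]
    if (is.matrix(col)) {
      # column names come from colnames(col)
      as.data.frame(col, stringsAsFactors = TRUE)
    } else {
      out <- data.frame(col, stringsAsFactors = TRUE)
      names(out) <- nm
      out
    }
  })
  # 'pieces' is an unnamed list, so cbind() keeps each piece's own names
  do.call(cbind, pieces)
}

m <- cbind(x = letters[1:3], y = letters[4:6])
d <- data.frame(a = 1:3, b = 4:6)
d$c <- m

dd <- flatten_df(d)
names(dd)   # "a" "b" "x" "y"
ncol(dd)    # 4
```

A matrix column without colnames would come out named V1, V2, ..., so that case would need its own handling.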
>
> Now what I wish to do is programmatically convert d to a 4 column
> frame with names c("a","b","x","y"). Of course:
>
> 1. The column classes/modes must be preserved (character going to
> factor and numeric remaining numeric).
>
> 2. I assume that I do not know a priori which of d's
> components/columns are matrices and which are vectors.
>
> 3. There may be many more columns which are vectors or matrices than
> just the three in this little example.
>
> I can easily and sensibly accomplish these 3 tasks, but the problem is
> that I run afoul of data frame column naming procedures in doing so,
> about which the data.frame Help page says rather enigmatically:
>
> "How the names of the data frame are created is complex, and the rest
> of this paragraph is only the basic story." Indeed!
> (This, of course, is shorthand for "Go look at the source if you want
> to know!" )
>
> Anyway, AFAICT from the Help, any "simple" approach to conversion
> using data.frame results in "c.x" and "c.y" for the names of the last
> two columns. I **can** get what I want by explicitly constructing the
> vector of names via the following ugly hack; my question is, can it be
> improved?
>
>> dd <- do.call(data.frame,d)
>
>> dd
>   a b c.x c.y
> 1 1 4   a   d
> 2 2 5   b   e
> 3 3 6   c   f
>
>> ncol(dd)
> [1] 4
>
>> cnames <- sapply(d,colnames)
>> cnames
> $a
> NULL
>
> $b
> NULL
>
> $c
> [1] "x" "y"
>
>
>> names(dd) <- unlist(ifelse(sapply(cnames,is.null),names(d),cnames))
> ##Yuck!
>
>> dd
>   a b x y
> 1 1 4 a d
> 2 2 5 b e
> 3 3 6 c f
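
One possibly tidier way to build that vector of names is to ask each component for its colnames and fall back to the component's own name when there are none. This is only a sketch of the naming step quoted above, not a different method. The variable new_names is introduced here for illustration, and the length check is an added guard for matrices that lack colnames.

```r
m <- cbind(x = letters[1:3], y = letters[4:6])
d <- data.frame(a = 1:3, b = 4:6)
d$c <- m

# Same first step as in the quoted code: one column per matrix column
dd <- do.call(data.frame, d)

# colnames() returns NULL for a plain vector, so vectors keep their own name
new_names <- unlist(lapply(names(d), function(nm) {
  cn <- colnames(d[[nm]])
  if (is.null(cn)) nm else cn
}))

# Guard: a matrix without colnames would yield too few names
stopifnot(length(new_names) == ncol(dd))
names(dd) <- new_names

names(dd)   # "a" "b" "x" "y"
```

This avoids the sapply/ifelse/unlist combination but still relies on do.call(data.frame, d) expanding the columns in the same order as names(d).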
>
> Cheers to all,
> Bert
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
David Winsemius, MD
Alameda, CA, USA