[Rd] Documentation request Re: [R] recode categorial vars into binary data
Martin Maechler
maechler at stat.math.ethz.ch
Tue May 14 17:44:43 CEST 2013
>>>>> David Winsemius <dwinsemius at comcast.net>
>>>>> on Mon, 13 May 2013 10:21:33 -0700 writes:
> On May 7, 2013, at 10:54 AM, Chris Stubben wrote:
>>
>>> First off, stop using cbind() when it is not needed. You will not see the reason when the columns are all numeric but you will start experiencing pain and puzzlement when the arguments are of mixed classes. The data.frame function will do what you want. (Where do people pick up this practice anyway?)
I had asked the same (in the past)...
and you guess a probable answer below.
>> Maybe from help( data.frame)?
>>
>> It's in most of the examples and is not needed ...
>>
>> L3 <- LETTERS[1:3]
>> (d <- data.frame(cbind(x=1, y=1:10), fac=sample(L3, 10, replace=TRUE)))
>> ## The same with automatic column names:
>> data.frame(cbind( 1, 1:10), sample(L3, 10, replace=TRUE))
>>
>> Chris
> There are many instances of new users posting questions to R-help where they use the form:
> dfrm <- data.frame(cbind(1:10, letters[1:10]) )
> … and predictably get a character mode for all their columns. I was pointed to the help page for `data.frame` as one possible source of this confusion. I would like to request that the examples be changed to:
> L3 <- LETTERS[1:3]
> (d <- data.frame(x = 1, y = 1:10, fac = sample(L3, 10, replace = TRUE)))
> ## The same with automatic column names:
> data.frame( 1, 1:10, sample(L3, 10, replace = TRUE))
Very good suggestion.... notably if your guess was right !
Unfortunately, this cannot make it into 3.0.1 (the examples are
*run* etc., which is too much for the "deep code freeze" we are in now).
But I plan to backport the change to "3.0.1 patched" once that
is released...
all in the big hope that people will *STOP* using
data.frame( cbind( ... ) )
in a habitual way.
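
For readers wondering why the habit hurts, here is a minimal sketch of the coercion. The names m, bad and good are made up for illustration, and stringsAsFactors = FALSE is passed explicitly so the result does not depend on the default of the R version in use:

```r
# cbind() builds a matrix, and a matrix holds a single type, so mixing
# numbers and strings coerces every element to character.
m <- cbind(1:3, letters[1:3])
typeof(m)

# Wrapping cbind() in data.frame() therefore loses the numeric column ...
bad <- data.frame(cbind(x = 1:3, y = letters[1:3]), stringsAsFactors = FALSE)

# ... whereas passing the columns directly keeps each column's own type.
good <- data.frame(x = 1:3, y = letters[1:3], stringsAsFactors = FALSE)

sapply(bad, class)   # x and y are both character
sapply(good, class)  # x stays integer, y is character
```

With all-numeric arguments the two forms give the same column types, which is probably why the problem goes unnoticed for so long.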
Martin
> --
> David Winsemius
> Alameda, CA, USA
PS: Are there other suggestions to help people *stop* using
ifelse(A, B, C)
in those many places where they should use
if(A) B else C
?
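
A small sketch of the difference, in case it helps as example material. The variable names are hypothetical; the point is only that ifelse() returns something shaped like its test, while if / else returns the chosen branch unchanged:

```r
x <- c(-1, 2, -3)

# Scalar condition: if / else evaluates one branch and returns it as is.
scalar <- if (length(x) > 2) x else NULL      # the whole vector x

# ifelse() with a length-one test returns a length-one result,
# silently dropping everything but the first element of x.
truncated <- ifelse(length(x) > 2, x, NA)     # only x[1]

# ifelse() is the right tool for element-wise choices.
elementwise <- ifelse(x < 0, 0, x)            # negative entries set to 0
```

So ifelse(A, B, C) is fine when A is a vector of conditions, and if(A) B else C is the one to use when A is a single TRUE or FALSE.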