[R] vectorization question
Martin Maechler
maechler at stat.math.ethz.ch
Fri Aug 15 10:44:31 CEST 2003
>>>>> "Tony" == Tony Plate <tplate at blackmesacapital.com>
>>>>> on Thu, 14 Aug 2003 11:43:11 -0600 writes:
Tony> From ?data.frame:
>> Details:
>>
>> A data frame is a list of variables of the same length with unique
>> row names, given class `"data.frame"'.
Tony> Your example constructs an object that does not
Tony> conform to the definition of a data frame (the new
Tony> column is not the same length as the old columns).
Tony> Some data frame functions may work OK with such an
Tony> object, but others will not. For example, the print
Tony> function for data.frame silently handles such an
Tony> illegal data frame (which could be described as
Tony> unfortunate.) It would probably be far easier to
Tony> construct a correct data frame in the first place than
Tony> to try to find and fix functions that don't handle
Tony> illegal data frames. For adding a new column to a
Tony> data frame, the expressions "x[,new.column.name] <-
Tony> value" and "x[[new.column.name]] <- value" will
Tony> replicate the value so that the new column is the same
Tony> length as the existing ones, while the "$" operator in
Tony> an assignment will not replicate the value. (One
Tony> could argue that this is a deficiency, but I think it
Tony> has been that way for a long time, and the behavior is
Tony> the same in the current version of S-plus.)
>> x1 <- data.frame(a=1:3)
>> x2 <- x1
>> x3 <- x1
>> x1$b <- 0
>> x2[,"b"] <- 0
>> x3[["b"]] <- 0
>> sapply(x1, length)
Tony> a b
Tony> 3 1
>> sapply(x2, length)
Tony> a b
Tony> 3 3
>> sapply(x3, length)
Tony> a b
Tony> 3 3
>> as.matrix(x2)
Tony>   a b
Tony> 1 1 0
Tony> 2 2 0
Tony> 3 3 0
>> as.matrix(x1)
Tony> Error in as.matrix.data.frame(x1) : dim<- length of dims do not match the
Tony> length of object
Thank you, Tony. This certainly was the most precise
explanation on this thread.
Everyone please note, however, that this has been improved (by
Brian Ripley) in the current R-devel {which should become R 1.8
in October}. There, "$<-" assignment on data frames also checks
the new value, and in this case does the same replication as the
[,] and [[]] assignments do.
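
A small sketch of what this means in practice, assuming R 1.8 or
later (the object name x4 is only for illustration and does not
appear in Tony's example):

```r
x4 <- data.frame(a = 1:3)

# With the improved "$<-" method, the single value is replicated
# to the number of rows, just as with [,] and [[]] assignment.
x4$b <- 0
lens <- sapply(x4, length)
print(lens)
stopifnot(all(lens == 3))

# A value that cannot be replicated evenly is rejected instead of
# silently producing an illegal data frame.
res <- try(x4$c <- 1:2, silent = TRUE)
stopifnot(inherits(res, "try-error"))
```

In R versions before this change, and in S-plus as Tony reports,
the first assignment would instead leave column b with length 1.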
For backward compatibility (with S-plus and earlier R versions),
I'd still recommend using bracket "[" rather than "$" assignment
when adding columns to data frames.
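
A minimal sketch of that recommendation, together with a check
that every column of a data frame matches its number of rows. The
helper name cols_consistent and the objects x and bad are made up
for illustration; they are not part of R or of this thread:

```r
# Hypothetical helper: TRUE if every column has exactly as many
# elements as the data frame has rows.
cols_consistent <- function(df) {
  all(sapply(df, length) == nrow(df))
}

x <- data.frame(a = 1:3)
x[, "b"]  <- 0     # bracket assignment replicates 0 to length 3
x[["c"]]  <- 0     # so does double-bracket assignment
stopifnot(cols_consistent(x))

# An illegal data frame built by hand, of the kind Tony describes:
# column b has length 1 although there are 3 rows.
bad <- structure(list(a = 1:3, b = 0),
                 row.names = 1:3, class = "data.frame")
stopifnot(!cols_consistent(bad))
```

Such a check can be useful when code has to run on versions where
"$" assignment does not replicate the value.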
Martin Maechler <maechler at stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><