[R] poly objects as data frame columns

Ulrike Grömping groemp at tfh-berlin.de
Fri Jul 17 23:25:31 CEST 2009



David Winsemius wrote:
> 
> 
> On Jul 17, 2009, at 3:24 PM, Ulrike Grömping wrote:
> 
>>
>> David,
>>
>> thanks. Your explanation does not quite fit, though, as it refers to using
>> function data.frame, while I assigned the new column with $<-. poly() does
>> return an object of classes poly and matrix, not model.matrix,
> 
> But model.matrix is not a class as far as I can tell. It has no  
> "is.<>" function, and examining a sample model matrix does not  
> indicate that it carries a special class attribute.
> 
It is a class all right, but it is apparently not assigned by default to
objects generated with function model.matrix. Try
mm <- model.matrix(lm(swiss))
str(data.frame(swiss,mm))
class(mm) <- c("model.matrix","matrix")
str(data.frame(swiss,mm))


David Winsemius wrote:
> 
>>...
>> It is just the assignment with "$" that does behave differently - and not
>> only for poly objects but for any matrix object. After I eventually
>> remembered how to get to the documentation of extractors
>> (?"$<-.data.frame"), I found this behavior documented there in the section
>> on Coercion. Nevertheless, this does seem to contradict the understanding of
>> what a data frame is. I am aware that data frames are lists, but they are of
>> course special lists, requiring that all list elements have the same number
>> of rows. So far I thought that all list elements also have the same number
>> of columns, namely just one. In fact, the documentation of function
>> data.frame states that
>>
>> "A data frame is a list of variables of the same length with unique row
>> names, given class "data.frame".",
>>
>> which would imply such a rule.
> 
> Except that the same page asserts:
> 
> "Note that when the replacement value is an array (including a matrix)  
> it is not treated as a series of columns (as data.frame and  
> as.data.frame do) but inserted as a single column."
> 
This is the piece on coercion in the extract documentation I was also
referring to. 
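
For readers following along, here is a minimal sketch of the difference discussed above. It assumes a toy data frame dat with the columns X1 and X2 used elsewhere in this thread; the object names p, wide and nested are only illustrative.

```r
# Toy data, assumed for illustration (mirrors the X1/X2 example in the thread)
dat <- data.frame(X1 = 1:10, X2 = LETTERS[1:10])
p <- poly(dat$X1, 3)          # 10 x 3 matrix with classes "poly" and "matrix"

# data.frame() treats the matrix as a series of columns
wide <- data.frame(dat, X1poly = p)

# `$<-` inserts the whole matrix as a single column
nested <- dat
nested$X1poly <- p

ncol(wide)                    # 5: X1, X2 and three polynomial columns
ncol(nested)                  # 3: X1, X2 and one matrix-valued column
dim(nested$X1poly)            # 10 3
```

So the same poly object ends up as three variables in one case and as one matrix-valued column in the other, depending only on how it was added.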


David Winsemius wrote:
> 
> ... which is more on-point documentation than what I offered earlier.
> I also found that wrapping the matrix in I() within data.frame() would
> replicate the behavior of df$x <- <mtx> (as documented in
> data.frame's help):
>  > dat2 <- data.frame(X1=1:10, X2=LETTERS[1:10], X1poly = I(poly(dat$X1,3)))
>  > length(dat2)
> [1] 3
>  > dat2[1,3]
>                1        2          3
> [1,] -0.4954337 0.522233 -0.4534252
> attr(,"class")
> [1] "poly"   "matrix"
>> The possibility of a matrix with more than
>> one column being a column of the data frame contradicts this piece of
>> documentation, since the length of the matrix is not the same as the length
>> of the other columns (e.g. length(poly(dat$X1,3)) is 30, not 10 like for the
>> other variables). Or would one consider the columns of the matrix X1poly the
>> variables, but X1poly a column? I'm not trying to be difficult, I just find
>> this quite confusing and wonder about the consequences when using such a
>> data frame in analyses.
> 
> There could be unforeseen consequences, but I am not the right person to  
> answer for all of those possibilities. I can see another instance  
> where it would be desirable to have tuples included in data.frames as  
> arrays and that is in the representation of complex numbers, but it  
> appears that the internal representation of complex numbers is more  
> completely hidden from casual view than is the capacity of data.frames  
> to carry matrices. If you have a compelling argument to change the  
> behavior of [<-.data.frame, you will need to take it up with the  
> developers.
> 
I have no idea which behavior is more useful; also, if this behavior has been
around for a long time, changing it would presumably break some code. I
suppose I would just opt for clearer documentation of the data frame class.
The bugs interface is currently down; I may file a documentation wish or
documentation bug later.
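
As a rough illustration of what "same length" seems to mean in practice for a matrix-valued column, here is a small sketch. The data frame dat and the response y are made up for the example, and the comments describe what I would expect rather than anything the documentation promises.

```r
# Toy data, assumed for illustration only
dat <- data.frame(X1 = 1:10)
dat$X1poly <- poly(dat$X1, 3)   # matrix stored as a single column

length(dat)                     # 2 list elements (columns)
nrow(dat)                       # 10
length(dat$X1poly)              # 30: every cell of the matrix is counted
NROW(dat$X1poly)                # 10: the row count is what matches nrow(dat)

# A matrix column can be used as one term in a model formula
set.seed(1)
dat$y <- rnorm(10)              # hypothetical response
fit <- lm(y ~ X1poly, data = dat)
length(coef(fit))               # 4: intercept plus three polynomial terms
```

In other words, it looks as if the number of rows, not length(), is what has to agree across columns, and at least lm() copes with such a column.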

Best regards, Ulrike

