[R] "[.data.frame" and lapply

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Mon Mar 30 11:40:49 CEST 2009


Bert Gunter wrote:
> Folks:
>
> I do not wish to agree or disagree with the criticisms of either the speed
> or possible design flaws of "[". But let's at least see what the docs say
> about the issues, using the simple example you provided:
>
>
>     m = matrix(1:9, 3, 3)
>     md = data.frame(m)
>
>     md[1]
>     # the first column
> ## as documented. This is because a data frame is a list of 3 columns of
> ## identical length, and this is how [ works for lists
>
>     m[1]
>     # the first element (i.e., m[1,1])
> ## as documented. A matrix is just a vector with a dim attribute and 
> ## this is how [ works for vectors
>
>     md[,i=3]
>     # third row
> ## See below
>
>     m[,i=3]
>     # third column
> ## Correct, as documented in ?"[" for matrices, to wit:
> "Note that these operations do not match their index arguments in the
> standard way: argument names are ignored and positional matching only is
> used. So m[j=2,i=1] is equivalent to m[2,1] and not to m[1,2]. "
>
> ## Note that the next lines immediately following say:
>
> "This may not be true for methods defined for them; for example it is not
> true for the data.frame methods described in [.data.frame.
>
> To avoid confusion, do not name index arguments (but drop and exact must be
> named). "
>
> So, while it may be fair to characterize the md[,i=3] behavior as a design
> flaw, it is both explicitly pointed out and warned against. Note that, of course,
>
> md[,3]
> ## 3rd column, good practice
> md[,j=3]
> ## also 3rd column .. but warned against as bad practice
>
> Whether a behavior should be considered a "bug" if it is explicitly warned
> against in the docs, I leave for others to decide. Too deep for me. 
>   

ok, there may be a point here.  but comments such as the above quotes
from ?'[' are evidence that the design is chaotic, with lots of
non-obvious exceptions explained somewhere in there,
please-read-every-single-letter-in-tfm.

furthermore, what is "This may not be true for methods defined for them"
supposed to tell a user trying to get an understanding of what will
happen if certain constructs are used?  and from what you quote, it
seems that the statement about ignored argument names (i.e., the index
names 'i' and 'j') is *not* applicable to [.data.frame.  it seems quite
clear to me.  and "To avoid confusion, do not name index arguments"
would better specify whose confusion is meant -- apparently, it is r's
implementation that is confused here.
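
for concreteness, here is a minimal sketch of the asymmetry under discussion, reusing m and md from the quoted example.  the names by_name_matrix and by_name_df are made up for illustration, and the comments describe the behaviour as reported in this thread and in ?'[', not a guarantee for every r version.

```r
# same objects as in the quoted example
m  <- matrix(1:9, 3, 3)
md <- data.frame(m)

# matrix: per ?"[", index names are ignored and matching is positional,
# so naming the index 'i' should make no difference
by_name_matrix <- m[, i = 3]
print(identical(by_name_matrix, m[, 3]))

# data frame: the thread reports that the name 'i' is honoured by
# [.data.frame, so this should select a row, like md[3, ]
by_name_df <- md[, i = 3]
print(identical(by_name_df, md[3, ]))

# unnamed indices: both objects should agree on "third column"
print(identical(md[, 3], m[, 3]))
```

if all three comparisons print TRUE, the only way to get the two objects to disagree is to name the index, which is exactly the case the docs tell the user not to try.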

best,
vQ



