[Rd] Another incorrect behaviour of [.data.frame (PR#13629)

waku at idi.ntnu.no waku at idi.ntnu.no
Sun Mar 29 01:20:11 CET 2009


Full_Name: Wacek Kusnierczyk
Version: 2.8.0 and 2.10.0 r48231
OS: Ubuntu 8.04 Linux 32 bit
Submission from: (NULL) (129.241.198.65)


In a previous report (Incorrect behaviour of [.data.frame (PR#13628), awaiting
approval) I showed that [.data.frame behaves incorrectly (i.e., in contradiction
to what ?'[.data.frame' and the R Language Definition say):

   d = data.frame(a=1:3, b=4:6)
   d[j=2]
   # returns the whole data frame, not just the second column

There appears to be one more issue with [.data.frame:

   d[x=1]
   # a list, not a data frame

and also

   d[x=1:2]
   # returns a *list* with the contents of the two columns of d

The argument 'x' is a legal argument to [.data.frame.  What happens here is that
[.data.frame receives 1:2 as the argument 'x' (because of name-based argument
matching) and d as the argument 'i' (because of subsequent positional argument
matching).  When [.data.frame calls NextMethod('['), a list is returned, and
then [.data.frame wraps the list into a structure as follows:

   return(structure(y, class = oldClass(x), 
      row.names = .row_names_info(x, 0L)))

(src/library/base/R/dataframe.R:531-532).  Since x is 1:2 and oldClass(1:2) is
NULL, the resulting structure is a plain list rather than a data frame, and
that list is the final result.
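
The matching order described above can be reproduced without [.data.frame
itself.  This is only a sketch: f is a hypothetical stand-in that shares the
leading formals (x, i, j) of [.data.frame, and the structure() call is a
simplified version of the wrapping step (the row.names part is left out).

```r
# Hypothetical stand-in with the same leading formals as `[.data.frame`
f <- function(x, i, j) list(x = x, i = i)

d <- data.frame(a = 1:3, b = 4:6)
m <- f(d, x = 1:2)   # same call shape as d[x = 1:2]

# Named matching binds x first; d then fills the next free formal, i
x_is_index <- identical(m$x, 1:2)
i_is_frame <- is.data.frame(m$i)

# oldClass() of a bare integer vector is NULL, so the wrapping step
# cannot restore the "data.frame" class on the list of columns
wrapped <- structure(list(a = 1:3, b = 4:6), class = oldClass(m$x))
wrapped_is_df <- is.data.frame(wrapped)
```

Here x_is_index and i_is_frame come out TRUE and wrapped_is_df comes out
FALSE, which is consistent with the explanation given for d[x=1:2].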

The result is clearly incorrect with respect to the documentation:

- there should be no dimension dropping when only one index is given (even if
drop=TRUE; the list-like indexing d[i], line 507 in the source);

- there should be no dimension dropping if more than one column is selected.
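
For comparison, the two points above correspond to ordinary unnamed indexing,
where the result stays a data frame.  A minimal check, using the same d as in
the examples earlier in this report (the variable names are illustrative):

```r
d <- data.frame(a = 1:3, b = 4:6)

one <- d[2]      # list-like indexing with a single index
two <- d[1:2]    # more than one column selected

one_is_df <- is.data.frame(one)   # TRUE: no dropping to a vector or list
two_is_df <- is.data.frame(two)   # TRUE
```

The named forms d[x=1] and d[x=1:2] would be expected to either behave the
same way or fail, but not to return a bare list.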

Furthermore, even the fact that d[x=1:2] succeeds is surprising:  [.data.frame
should try to select from 1:2 using d as an index, which should fail.  It seems
that the call to NextMethod incorrectly matches the arguments, and receives the
first (unnamed) argument of [.data.frame (which is d) as the data frame and the
second argument (named 'x', which is 1:2) as the index, and returns a list of
columns instead of raising an error.

Regards,
vQ


