[Rd] Deprecating partial matching in $.data.frame

Hervé Pagès hpages at fhcrc.org
Fri Mar 22 05:57:58 CET 2013


Hi,

Maybe a compromise would be to just issue a warning without
deprecating? That way people who want to do anova(fit1)$P can
still do it. When working interactively, it's certainly convenient
(serious code however should probably stay away from partial matching).
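
To make the trade-off concrete, here is a minimal sketch of the difference
between the convenient and the strict way of extracting an element. The
object 'fit' and its element names are made up for illustration (a plain
list standing in for a result object, not real anova() output); the
'warnPartialMatchDollar' option is a standard R option:

```r
# Hypothetical stand-in for a result object; the names are invented.
fit <- list(Pvalue = 0.03, Df = 2L)

# Interactive convenience: `$` on a list falls back to partial matching.
p_quick <- fit$P

# Stricter style for serious code: `[[` matches names exactly by default,
# so an abbreviated name yields NULL instead of a silent match.
p_exact <- fit[["P"]]

# Partial matching with `[[` has to be requested explicitly.
p_loose <- fit[["P", exact = FALSE]]

# Ask R to warn whenever `$` resolves a name by partial matching,
# and capture the warning message instead of printing it.
old <- options(warnPartialMatchDollar = TRUE)
msg <- tryCatch(fit$P, warning = function(w) conditionMessage(w))
options(old)  # restore the previous setting
```

With the option switched on, the abbreviation still works but is no longer
silent, which is roughly the behaviour I have in mind.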

And so you keep the semantics consistent with lists because yes,
consistency is important. data.frame inherits from list so any
operation that works on a list is expected to work on a data.frame,
preferably the same way (otherwise it will always be a BIG surprise
to the user/programmer). For example if I have to maintain someone
else's code and see something like:

    bar <- x$bar

and I know that 'x' is a list that contains atomic vectors of the
same length, I could have some good reasons to want to use a
data.frame instead of a list. And I would assume it's safe to
modify the code by adding the following line earlier in it:

    x <- as.data.frame(x)

But with the proposed change to $.data.frame, I cannot make this
kind of assumption anymore...
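
A small sketch of that scenario, with made-up data: here the list element
is actually called 'barcode' (a hypothetical name), so the inherited code
only works thanks to partial matching. What the last extraction returns
depends on how $.data.frame treats partial matches in the R version at
hand, so I deliberately don't state a result for it:

```r
# Hypothetical list of atomic vectors of the same length.
x <- list(barcode = c("a1", "b2", "c3"), count = c(10L, 20L, 30L))

# The inherited line: on a plain list, 'bar' partially matches 'barcode'.
bar_list <- x$bar

# The refactoring that looks safe: same data, now a data.frame.
x_df <- as.data.frame(x, stringsAsFactors = FALSE)

# Same extraction on the data.frame. Depending on the semantics of
# $.data.frame this may match silently, match with a warning, or yield
# NULL; suppressWarnings() only keeps the sketch quiet.
bar_df <- suppressWarnings(x_df$bar)

# Exact extraction with `[[` behaves the same on both objects.
same_exact <- identical(x[["barcode"]], x_df[["barcode"]])
```

If bar_df stops being identical to bar_list, the one-line change above has
silently altered what the rest of the code sees.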

My two cents

H.


On 03/21/2013 06:52 AM, peter dalgaard wrote:
>
> On Mar 21, 2013, at 09:25 , Rainer M Krug wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 20/03/13 17:58, Hadley Wickham wrote:
>>> On Wed, Mar 20, 2013 at 11:26 AM, peter dalgaard <pdalgd at gmail.com> wrote:
>>>>
>>>> On Mar 20, 2013, at 16:59 , William Dunlap wrote:
>>>>
>>>>> Will you be doing the same for attribute names?
>>>>
>>>> Not at this point.
>>>
>>> It would be really nice to have consistent behaviour across argument names, attributes, lists
>>> and data frames, at least for R CMD check.
>>
>> I agree with Hadley that consistency is quite important. This is especially true for data.frames
>> and lists, as this concerns the data itself, and not names or attributes of the data.
>
> Well, maybe consistency is important, but partial matching never worked for $-extraction in environments, so the current change could be considered mainly a nudge of data frames in the direction of environments. After all, both can be thought of as collections of named objects.
>
> General lists are a somewhat different issue. They often, formally or informally, represent classed objects with a defined set of names, typically obtained as return values from functions. Since the names are known, people will have used the expedient of abbreviating them. This can happen with data frames as well, but less commonly, since it is in general unsafe to rely on column names being uniquely defined by any particular prefix.
>
> I.e., deprecating partial matching for lists opens a rather larger can of worms, and might require more extensive code revisions. Also, the performance hit of a runtime check for partial matching might be more important for lists than it is for data frames. It could be worth it to implement an R CMD check warning as you suggest, but perhaps not just now.
>
> -Peter
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
