[Rd] By default, `names<-` alters S4 objects

Hervé Pagès hpages at fhcrc.org
Tue May 17 01:13:34 CEST 2011


On 11-05-16 01:53 PM, John Chambers wrote:
>
>
> On 5/16/11 10:09 AM, Hervé Pagès wrote:
>> On 11-05-16 09:36 AM, John Chambers wrote:
>>> You set up a names slot in a non-vector. Maybe that should be allowed,
>>> maybe not. But in any case I would not expect the names() primitive to
>>> find it, because your object has a non-vector type ("S4").
>>
>> But the names<-() primitive *does* find it. So either names() and
>> names<-() should both find it, or they shouldn't. I mean, if you care
>> about consistency and predictability of course.
>
> That's not the only case where borderline or mistaken behavior is caught
> on assignment, but not on access. The argument is that assignment can
> afford to check things, but access needs to be fast. Slot access is
> another case. There, assignment ensures legality so access can be quick.
>
> The catch is that there are sometimes backdoor ways to assignments,
> partly because slots, attributes and some "builtin" properties like
> names overlap.
>
> What we were talking about before was trying to evolve a sensible rule
> for assigning names to S4 objects. Let's try to discuss what people need
> to do before carping or indulging in sarcasm.

That is what *you* were talking about, but not what my original post
was about. Anyway, about the following proposal:

    1. If the class has a vector data slot and no names slot, assign
       the names but with a warning.

    2. Otherwise, throw an error.

    (I.e., I would prefer an error throughout, but discretion ....)

I personally don't like it because it breaks inheritance. Let's
say I have a class B with a vector data slot and no names slot.
According to 1. names<-() would work out-of-the-box on it (with
a warning), but now if I extend it by adding a names slot, it
breaks.
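To make the objection concrete, here is a minimal sketch of the proposed
rule written as a plain R function. The function name names_assign_policy
and the subclass name "B2" are hypothetical, introduced only for
illustration; the sketch classifies a class definition and does not
change what names<-() actually does.

```r
library(methods)

# Class B: vector data slot (.Data), no names slot
setClass("B", contains = "integer")

# Hypothetical subclass that only adds a names slot
setClass("B2", contains = "B", representation(names = "character"))

# Hypothetical helper: what the proposed rule would do for a given class
names_assign_policy <- function(class_name) {
  slots <- getSlots(class_name)   # named character vector: slot -> class
  has_vector_data <- ".Data" %in% names(slots) &&
    extends(slots[[".Data"]], "vector")
  has_names_slot <- "names" %in% names(slots)
  # Rule 1: vector data slot and no names slot -> assign with a warning
  # Rule 2: anything else -> error
  if (has_vector_data && !has_names_slot) "warn" else "error"
}

names_assign_policy("B")    # rule 1 applies
names_assign_policy("B2")   # rule 2 applies, although B2 extends B
```

Under these assumptions the parent class falls under rule 1 and its
subclass under rule 2, so code written against B stops working on B2.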

One thing to consider though is that this works right now (and with
no warning):

   > setClass("I", contains="integer")
   [1] "I"
   > i <- new("I", 1:4)
   > names(i) <- LETTERS[1:4]
   > attributes(i)
   $class
   [1] "I"
   attr(,"package")
   [1] ".GlobalEnv"

   $names
   [1] "A" "B" "C" "D"

   > names(i)
   [1] "A" "B" "C" "D"

and it's probably what most people would expect (sounds reasonable
after all). So this needs to keep working (with no warning). I can
see 2 ways to avoid breaking inheritance:

   (a) not allow a names slot to be added to class I or any
       of its subclasses (in other words the .Data and names
       slots cannot coexist),
or
   (b) have names() and names<-() keep working when the names slot is
       added. That may be dangerous, though, as it might break C code
       that tries to access the names: inheritance would still break,
       only now at the C level.

Classes that don't have a .Data slot can of course have a names slot.
I don't have a strong opinion on whether names() and names<-() should
access it by default, but honestly that's a very small convenience
offered to the developer of the class. For the sake of consistency,
the same would need to be done for dim, dimnames and built-in
attributes in general. It also won't work if those
built-in-attributes-made-slots are not declared with the right type
in the setClass statement (i.e. "character" for names, "integer" for
dim, etc...). And by default names() would return character(0), not
NULL. So in the end, potentially a lot of complications / surprises /
inconsistencies for very little value.
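For comparison, a class developer who wants names() and names<-() on a
class without a .Data slot can already define the methods explicitly.
The sketch below is one possible way to do it, not something proposed
in this thread: the class name "Tagged" is hypothetical, and the slot is
called NAMES (the workaround mentioned in the quoted text below) so that
it does not overlap with the built-in names attribute.

```r
library(methods)

# Hypothetical class: no .Data slot, names kept in a NAMES slot
setClass("Tagged", representation(NAMES = "character"))

# Explicit accessor: return NULL rather than character(0) when unset
setMethod("names", "Tagged", function(x) {
  if (length(x@NAMES) == 0L) NULL else x@NAMES
})

# Explicit replacement method: coerce, store in the slot, revalidate
setReplaceMethod("names", "Tagged", function(x, value) {
  x@NAMES <- as.character(value)
  validObject(x)
  x
})

a <- new("Tagged")
names(a)          # dispatches to the accessor defined above
names(a) <- "K"   # dispatches to the replacement method
names(a)
```

With explicit methods the getter and the setter go through the same
slot, so they cannot disagree the way the default primitives do.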

Thanks,
H.

>
> John
>
>>
>> H.
>>
>>
>>> You could do
>>> a@names if you thought that made sense:
>>>
>>>
>>> > setClass("A", representation(names="character"))
>>> [1] "A"
>>> > a <- new("A")
>>> > a@names <- "xx"
>>> > a@names
>>> [1] "xx"
>>> > names(a)
>>> NULL
>>>
>>>
>>> If you wanted something sensible, it's more like:
>>>
>>> > setClass("B", representation(names = "character"), contains =
>>> "integer")
>>> [1] "B"
>>> > b <- new("B", 1:5)
>>> > names(b) <- letters[1:5]
>>> > b
>>> An object of class "B"
>>> [1] 1 2 3 4 5
>>> Slot "names":
>>> [1] "a" "b" "c" "d" "e"
>>>
>>> > names(b)
>>> [1] "a" "b" "c" "d" "e"
>>>
>>> This allows both the S4 and the primitive code to deal with a
>>> well-defined object.
>>>
>>> John
>>>
>>>
>>> On 5/15/11 3:02 PM, Hervé Pagès wrote:
>>>> On 11-05-15 11:33 AM, John Chambers wrote:
>>>>> This is basically a case of a user error that is not being caught:
>>>>
>>>> Sure!
>>>>
>>>> https://stat.ethz.ch/pipermail/r-devel/2009-March/052386.html
>>> ......
>>>
>>>>
>>>> Ah, that's interesting. I didn't know I could put a names slot in my
>>>> class. Last time I tried was at least 3 years ago and that was causing
>>>> problems (don't remember the exact details) so I ended up using NAMES
>>>> instead. Trying again with R-2.14:
>>>>
>>>> > setClass("A", representation(names="character"))
>>>>
>>>> > a <- new("A")
>>>>
>>>> > attributes(a)
>>>> $names
>>>> character(0)
>>>>
>>>> $class
>>>> [1] "A"
>>>> attr(,"package")
>>>> [1] ".GlobalEnv"
>>>>
>>>> > names(a)
>>>> NULL
>>>>
>>>> > names(a) <- "K"
>>>>
>>>> > attributes(a)
>>>> $names
>>>> [1] "K"
>>>>
>>>> $class
>>>> [1] "A"
>>>> attr(,"package")
>>>> [1] ".GlobalEnv"
>>>>
>>>> > names(a)
>>>> NULL
>>>>
>>>> Surprise! But that's another story...
>>>>
>>>>>
>>>>> The modification that would make sense would be to give you an
>>>>> error in
>>>>> the above code. Not a bad idea, but it's likely to generate more
>>>>> complaints in other contexts, particularly where people don't
>>>>> distinguish the "list" class from lists with names (the "namedList"
>>>>> class).
>>>>>
>>>>> A plausible strategy:
>>>>> 1. If the class has a vector data slot and no names slot, assign the
>>>>> names but with a warning.
>>>>>
>>>>> 2. Otherwise, throw an error.
>>>>>
>>>>> (I.e., I would prefer an error throughout, but discretion ....)
>>>>
>>>> Or, at a minimum (if no consensus can be reached about the above
>>>> strategy), not add a "names" attribute set to NULL. My original
>>>> post was more about keeping the internal representation of objects
>>>> "normalized", in general, so identical() is more likely to be
>>>> meaningful.
>>>>
>>>> Thanks,
>>>> H.
>>>>
>>>>>
>>>>> Comments?
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> H.
>>>>>>
>>>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>>
>>
>>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


