[Rd] By default, `names<-` alters S4 objects

Hervé Pagès hpages at fhcrc.org
Tue May 17 23:39:23 CEST 2011


On 11-05-17 01:15 PM, John Chambers wrote:
>
>
> On 5/17/11 9:53 AM, Hervé Pagès wrote:
>> On 11-05-17 09:04 AM, John Chambers wrote:
>>> One point that may have been unclear, though it's surprising if so. The
>>> discussion was about assigning names to S4 objects from classes that do
>>> NOT have a formal "names" slot. Of course, having a "names" slot is not
>>> illegal, it's what one should do to deal with names in S4.
>>
>> IMO it looks more like what one should avoid doing right now because
>> it's broken (as reported previously):
>>
>> > setClass("A", representation(names="character"))
>> > a <- new("A")
>> > names(a) <- "K"
>> > names(a)
>> NULL
>>
>> And on that particular issue here is what you said:
>>
>>     You set up a names slot in a non-vector. Maybe that should be
>>     allowed, maybe not.
>>
>> And now:
>>
>>     Of course, having a "names" slot is not illegal, it's what one
>>     should do to deal with names in S4.
>>
>> ??!
>
> Good grief. The classes like namedList _are_ vectors, that's the point.
>
> Anyway, this is a waste of time. I will add some code to r-devel that
> checks S4 objects when assigning names. People can try it out on their
> examples.

Thanks!

H.

>
>>
>> H.
>>
>>
>>> Look at class
>>> "namedList" for example.
>>>
>>> Assigning names() to such a class would go through without warning as it
>>> does now.
>>>
>>> > getClass("namedList")
>>> Class "namedList" [package "methods"]
>>>
>>> Slots:
>>>
>>> Name:      .Data     names
>>> Class:      list character
>>>
>>> Extends:
>>> Class "list", from data part
>>> Class "vector", by class "list", distance 2
>>>
>>> Known Subclasses: "listOfMethods"
>>> > xx <- new("namedList", list(a=1,b=2))
>>> > names(xx)
>>> [1] "a" "b"
>>> > names(xx) <- c("D", "E")
>>> > xx@names
>>> [1] "D" "E"
>>> >
>>>
>>> There was no question of breaking inheritance.
>>>
>>> On 5/16/11 4:13 PM, Hervé Pagès wrote:
>>>> On 11-05-16 01:53 PM, John Chambers wrote:
>>>>>
>>>>>
>>>>> On 5/16/11 10:09 AM, Hervé Pagès wrote:
>>>>>> On 11-05-16 09:36 AM, John Chambers wrote:
>>>>>>> You set up a names slot in a non-vector. Maybe that should be
>>>>>>> allowed, maybe not. But in any case I would not expect the names()
>>>>>>> primitive to find it, because your object has a non-vector type
>>>>>>> ("S4").
>>>>>>
>>>>>> But the names<-() primitive *does* find it. So either names() and
>>>>>> names<-() should both find it, or they shouldn't. I mean, if you care
>>>>>> about consistency and predictability of course.
>>>>>
>>>>> That's not the only case where borderline or mistaken behavior is
>>>>> caught on assignment, but not on access. The argument is that
>>>>> assignment can afford to check things, but access needs to be fast.
>>>>> Slot access is another case. There, assignment ensures legality so
>>>>> access can be quick.
>>>>>
>>>>> The catch is that there are sometimes backdoor ways to assignments,
>>>>> partly because slots, attributes and some "builtin" properties like
>>>>> names overlap.
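
To make the quoted point concrete, here is a minimal sketch of a checked slot assignment next to an unchecked backdoor. The class "Tagged" and its slot "label" are made up for illustration; the slot name is deliberately not a built-in attribute, so the sketch stays clear of the names() question itself.

```r
library(methods)

# Hypothetical class, for illustration only.
setClass("Tagged", representation(label = "character"))
obj <- new("Tagged", label = "ok")

# Checked path: `@<-` verifies the value against the declared slot class.
checked <- tryCatch({ obj@label <- 1L; "accepted" },
                    error = function(e) "rejected")

# Backdoor: slots are stored as attributes, and `attr<-` does no such check.
backdoor <- obj
attr(backdoor, "label") <- 1L

# Slot access stays cheap: it returns whatever is stored.
stored_class <- class(backdoor@label)

# Only an explicit validity check notices that the object is now invalid.
still_valid <- tryCatch({ validObject(backdoor); TRUE },
                        error = function(e) FALSE)
```

The sketch only illustrates the general mechanism (check on assignment, trust on access); it says nothing about how names<-() itself should behave.
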
>>>>>
>>>>> What we were talking about before was trying to evolve a sensible rule
>>>>> for assigning names to S4 objects. Let's try to discuss what people
>>>>> need to do before carping or indulging in sarcasm.
>>>>
>>>> What *you* were talking about but not what my original post was about.
>>>> Anyway, about the following proposal:
>>>>
>>>> 1. If the class has a vector data slot and no names slot, assign the
>>>> names but with a warning.
>>>>
>>>> 2. Otherwise, throw an error.
>>>>
>>>> (I.e., I would prefer an error throughout, but discretion ....)
>>>>
>>>> I personally don't like it because it breaks inheritance. Let's
>>>> say I have a class B with a vector data slot and no names slot.
>>>> According to 1. names<-() would work out-of-the-box on it (with
>>>> a warning), but now if I extend it by adding a names slot, it
>>>> breaks.
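
The objection above can be spelled out with a small sketch that applies a strictly literal reading of the two-point proposal to class definitions. The function literal_rule() and the classes "V", "VN" and "P" are hypothetical, and this is not how R implements anything; John's later clarification in this thread is that classes with a formal names slot would keep going through without warning.

```r
library(methods)

# Hypothetical helper: literal reading of the quoted proposal.
#   1. vector data part and no names slot -> assign, with a warning
#   2. otherwise                          -> error
literal_rule <- function(class_name) {
  has_vector_data <- extends(class_name, "vector")
  has_names_slot  <- "names" %in% slotNames(class_name)
  if (has_vector_data && !has_names_slot) "warning" else "error"
}

# Hypothetical classes, for illustration only.
setClass("V", contains = "integer")                        # vector, no names slot
setClass("VN", representation(names = "character"),
         contains = "V")                                   # subclass adds names slot
setClass("P", representation(label = "character"))         # non-vector

outcomes <- vapply(c("V", "VN", "P"), literal_rule, character(1))
```

Under this literal reading the parent class "V" lands in case 1 while its subclass "VN" lands in case 2, which is the inheritance break described above.
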
>>>>
>>>> One thing to consider though is that this works right now (and with
>>>> no warning):
>>>>
>>>> > setClass("I", contains="integer")
>>>> [1] "I"
>>>> > i <- new("I", 1:4)
>>>> > names(i) <- LETTERS[1:4]
>>>> > attributes(i)
>>>> $class
>>>> [1] "I"
>>>> attr(,"package")
>>>> [1] ".GlobalEnv"
>>>>
>>>> $names
>>>> [1] "A" "B" "C" "D"
>>>>
>>>> > names(i)
>>>> [1] "A" "B" "C" "D"
>>>>
>>>> and it's probably what most people would expect (sounds reasonable
>>>> after all). So this needs to keep working (with no warning). I can
>>>> see 2 ways to avoid breaking inheritance:
>>>>
>>>> (a) not allow a names slot to be added to class I or any
>>>> of its subclasses (in other words the .Data and names
>>>> slots cannot coexist),
>>>> or
>>>> (b) have names() and names<-() keep working when the names slot is
>>>> added but that is maybe dangerous as it might break C code that
>>>> is trying to access the names, that is, inheritance might break
>>>> but now at the C level
>>>>
>>>> Now for classes that don't have a .Data slot, they can of course
>>>> have a names slot. I don't have a strong opinion on whether names()
>>>> and names<-() should access it by default, but honestly that's really
>>>> a very small convenience offered to the developer of the class. Also,
>>>> for the sake of consistency, the same would need to be done for dim,
>>>> dimnames and built-in attributes in general. And also that won't work
>>>> if those built-in-attributes-made-slots are not declared with the right
>>>> type in the setClass statement (i.e. "character" for names, "integer"
>>>> for dim, etc...). And also by default names() would return character(0)
>>>> and not NULL. So in the end, potentially a lot of complications /
>>>> surprise / inconsistencies for very little value.
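
For comparison, here is a minimal sketch of what the class developer would write without any default slot access: explicit names() and names<-() methods backed by a differently named slot. The class "Labelled" is hypothetical; the NAMES slot follows the workaround mentioned elsewhere in this thread, and returning NULL for an empty slot is an assumption made to mimic the usual names() convention.

```r
library(methods)

# Hypothetical class, for illustration only.
setClass("Labelled", representation(NAMES = "character"))

# Accessor: report NULL rather than character(0) when nothing is stored.
setMethod("names", "Labelled", function(x) {
  if (length(x@NAMES) == 0L) NULL else x@NAMES
})

# Replacement: store the names in the slot through the checked `@<-` path.
setReplaceMethod("names", "Labelled", function(x, value) {
  x@NAMES <- as.character(value)
  x
})

a <- new("Labelled")
before <- names(a)
names(a) <- "K"
after <- names(a)
```

With the methods in place, names() and names<-() agree with each other, at the cost of a few lines per class.
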
>>>>
>>>> Thanks,
>>>> H.
>>>>
>>>>>
>>>>> John
>>>>>
>>>>>>
>>>>>> H.
>>>>>>
>>>>>>
>>>>>>> You could do a@names if you thought that made sense:
>>>>>>>
>>>>>>>
>>>>>>> > setClass("A", representation(names="character"))
>>>>>>> [1] "A"
>>>>>>> > a <- new("A")
>>>>>>> > a@names <- "xx"
>>>>>>> > a@names
>>>>>>> [1] "xx"
>>>>>>> > names(a)
>>>>>>> NULL
>>>>>>>
>>>>>>>
>>>>>>> If you wanted something sensible, it's more like:
>>>>>>>
>>>>>>> > setClass("B", representation(names = "character"), contains = "integer")
>>>>>>> [1] "B"
>>>>>>> > b <- new("B", 1:5)
>>>>>>> > names(b) <- letters[1:5]
>>>>>>> > b
>>>>>>> An object of class "B"
>>>>>>> [1] 1 2 3 4 5
>>>>>>> Slot "names":
>>>>>>> [1] "a" "b" "c" "d" "e"
>>>>>>>
>>>>>>> > names(b)
>>>>>>> [1] "a" "b" "c" "d" "e"
>>>>>>>
>>>>>>> This allows both the S4 and the primitive code to deal with a
>>>>>>> well-defined object.
>>>>>>>
>>>>>>> John
>>>>>>>
>>>>>>>
>>>>>>> On 5/15/11 3:02 PM, Hervé Pagès wrote:
>>>>>>>> On 11-05-15 11:33 AM, John Chambers wrote:
>>>>>>>>> This is basically a case of a user error that is not being caught:
>>>>>>>>
>>>>>>>> Sure!
>>>>>>>>
>>>>>>>> https://stat.ethz.ch/pipermail/r-devel/2009-March/052386.html
>>>>>>> ......
>>>>>>>
>>>>>>>>
>>>>>>>> Ah, that's interesting. I didn't know I could put a names slot in
>>>>>>>> my class. Last time I tried was at least 3 years ago and that was
>>>>>>>> causing problems (don't remember the exact details) so I ended up
>>>>>>>> using NAMES instead. Trying again with R-2.14:
>>>>>>>>
>>>>>>>> > setClass("A", representation(names="character"))
>>>>>>>>
>>>>>>>> > a <- new("A")
>>>>>>>>
>>>>>>>> > attributes(a)
>>>>>>>> $names
>>>>>>>> character(0)
>>>>>>>>
>>>>>>>> $class
>>>>>>>> [1] "A"
>>>>>>>> attr(,"package")
>>>>>>>> [1] ".GlobalEnv"
>>>>>>>>
>>>>>>>> > names(a)
>>>>>>>> NULL
>>>>>>>>
>>>>>>>> > names(a) <- "K"
>>>>>>>>
>>>>>>>> > attributes(a)
>>>>>>>> $names
>>>>>>>> [1] "K"
>>>>>>>>
>>>>>>>> $class
>>>>>>>> [1] "A"
>>>>>>>> attr(,"package")
>>>>>>>> [1] ".GlobalEnv"
>>>>>>>>
>>>>>>>> > names(a)
>>>>>>>> NULL
>>>>>>>>
>>>>>>>> Surprise! But that's another story...
>>>>>>>>
>>>>>>>>>
>>>>>>>>> The modification that would make sense would be to give you an
>>>>>>>>> error in the above code. Not a bad idea, but it's likely to
>>>>>>>>> generate more complaints in other contexts, particularly where
>>>>>>>>> people don't distinguish the "list" class from lists with names
>>>>>>>>> (the "namedList" class).
>>>>>>>>>
>>>>>>>>> A plausible strategy:
>>>>>>>>> 1. If the class has a vector data slot and no names slot, assign
>>>>>>>>> the names but with a warning.
>>>>>>>>>
>>>>>>>>> 2. Otherwise, throw an error.
>>>>>>>>>
>>>>>>>>> (I.e., I would prefer an error throughout, but discretion ....)
>>>>>>>>
>>>>>>>> Or, at a minimum (if no consensus can be reached about the above
>>>>>>>> strategy), not add a "names" attribute set to NULL. My original
>>>>>>>> post was more about keeping the internal representation of objects
>>>>>>>> "normalized", in general, so identical() is more likely to be
>>>>>>>> meaningful.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> H.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comments?
>>>>>>>>>
>>>>>>>>> John
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> H.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> R-devel at r-project.org mailing list
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


