[Rd] bug in partial matching of attribute names
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Feb 14 07:16:10 CET 2007
On Tue, 13 Feb 2007, Tony Plate wrote:
> Ok, thanks for the news of a fix in progress.
BTW, your suggested fix is incorrect. Consider having an exact match
after two partial matches in the list of attributes: breaking out of the
loop at the second partial match would stop before the exact match is seen.
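
To illustrate the ordering problem, here is a minimal standalone sketch in C.
It is not the code from attrib.c: find_attr() and the plain string array are
hypothetical stand-ins for the loop over ATTRIB(s), and only the state names
NONE, PARTIAL, PARTIAL2 and FULL follow this thread. The loop records an
ambiguous partial match but keeps scanning, so an exact match later in the
list can still win.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Match states, named after those mentioned in the thread. */
typedef enum { NONE, PARTIAL, PARTIAL2, FULL } match_t;

/* Hypothetical stand-in for the loop over ATTRIB(s): scans `names` for
   `query` and returns the index of the matching name, or -1 when there is
   no match or only an ambiguous partial match. */
int find_attr(const char *const *names, size_t count, const char *query)
{
    assert(names != NULL && query != NULL);
    size_t n = strlen(query);
    match_t match = NONE;
    int found = -1;

    for (size_t i = 0; i < count; i++) {
        if (strncmp(names[i], query, n) != 0)
            continue;                 /* query is not a prefix of this name */
        if (strlen(names[i]) == n) {  /* exact match always wins */
            match = FULL;
            found = (int)i;
            break;
        }
        if (match == NONE) {          /* first partial match so far */
            match = PARTIAL;
            found = (int)i;
        } else {
            /* second or later partial match: stay ambiguous, never fall
               back to PARTIAL, and do not break, since an exact match
               may still follow */
            match = PARTIAL2;
        }
    }
    return (match == FULL || match == PARTIAL) ? found : -1;
}
```

With the names "abcdef", "abcd", "ab" and the query "ab", this returns index 2
(the exact match), whereas a version that breaks at the second partial match
would report no match.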
> On the topic of the "names" attribute being treated specially, I wonder if
> the do_attr() function might treat it a little too specially. As far as I
> can tell, the loop in the first large block of code in do_attr() (attrib.c),
> which begins
>
> /* try to find a match among the attributes list */
> for (alist = ATTRIB(s); alist != R_NilValue; alist = CDR(alist)) {
>
> will find a full or partial match for a "names" attribute (at least for
> ordinary lists and vectors).
>
> Then the large block of code after that, beginning:
>
> /* unless a full match has been found, check for a "names" attribute */
> if (match != FULL && ! strncmp(CHAR(PRINTNAME(R_NamesSymbol)), str, n))
> {
>
> seems unnecessary because a names attribute has already been checked for.
> In the case of a partial match on the "names" attribute this code will
> behave as though there is an ambiguous partial match, and (incorrectly)
> return Nil. Is this second block of code specific to the "names" attribute
> possibly a hangover from an earlier day when the first loop didn't detect a
> "names" attribute? Or am I missing something? Are there some other objects
> for which the first loop doesn't include a "names" attribute?
Yes: I pointed you at the 'R internals' manual, but this is also on the
help page. 1D arrays and pairlists have names stored elsewhere. It needs
to be changed to be
- else if (match == PARTIAL) {
+ else if (match == PARTIAL && strcmp(CHAR(PRINTNAME(tag)), "names")) {
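
As a rough illustration of what the changed condition does, the following
standalone C sketch models the "names" check with plain strings. It is a
simplified model built only from the description in this thread, not the code
in attrib.c: resolve_names() is a hypothetical function, `match` and `tag`
stand for the state left by the first loop, and the sketch assumes the object
really has names.

```c
#include <assert.h>
#include <string.h>

typedef enum { NONE, PARTIAL, PARTIAL2, FULL } match_t;

/* Hypothetical model of the block that checks for a "names" attribute.
   Returns the attribute name to look up, or NULL to stand for R_NilValue. */
const char *resolve_names(match_t match, const char *tag, const char *query)
{
    size_t n = strlen(query);

    if (match == PARTIAL2)
        return NULL;                      /* already ambiguous */

    if (match != FULL && strncmp("names", query, n) == 0) {
        if (strlen("names") == n)
            return "names";               /* full match on "names" */
        else if (match == NONE)
            return "names";               /* names stored outside the list */
        else if (match == PARTIAL && strcmp(tag, "names") != 0)
            return NULL;                  /* a different attribute matched too */
        /* match == PARTIAL with tag "names": the first loop already found
           "names" itself, so this is not an ambiguity */
    }
    return (match == NONE) ? NULL : tag;
}
```

In this model the query "nam" resolves to "names" whether the first loop found
the names attribute (match == PARTIAL, tag "names") or not (match == NONE),
and it is rejected only when some other attribute also matched partially.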
> -- Tony Plate
>
> Prof Brian Ripley wrote:
>> It happens that I was looking at this yesterday (in connection with
>> encodings on CHARSXPs) and have a fix in testing across CRAN right now.
>>
>> As for "names", as you will know from reading 'R Internals' the names can
>> be stored in more than one place, which is why it has to be treated
>> specially.
>>
>> On Mon, 12 Feb 2007, Tony Plate wrote:
>>
>>> There looks to be a bug in do_attr() (src/main/attrib.c): incorrect
>>> partial matches of attribute names can be returned when there are an odd
>>> number of partial matches.
>>>
>>> E.g.:
>>>
>>> > x <- c(a=1,b=2)
>>> > attr(x, "abcdef") <- 99
>>> > attr(x, "ab")
>>> [1] 99
>>> > attr(x, "abc") <- 100
>>> > attr(x, "ab") # correctly returns NULL because of ambig partial match
>>> NULL
>>> > attr(x, "abcd") <- 101
>>> > attr(x, "ab") # incorrectly returns non-NULL for ambig partial match
>>> [1] 101
>>> > names(attributes(x))
>>> [1] "names" "abcdef" "abc" "abcd"
>>> >
>>>
>>> The problem in do_attr() looks to be that after match is set to
>>> PARTIAL2, it can be set back to PARTIAL again. I think a simple fix is
>>> to add a "break" in this block in do_attr():
>>>
>>> else if (match == PARTIAL) {
>>> /* this match is partial and we already have a partial match,
>>> so the query is ambiguous and we return R_NilValue */
>>> match = PARTIAL2;
>>> break; /* <---- ADD BREAK HERE */
>>> } else {
>>>
>>> However, if this is indeed a bug, would this be a good opportunity to
>>> get rid of partial matching on attribute names -- it was broken anyway
>>> -- so toss it out? :-) Does anyone depend on partial matching for
>>> attribute names? My view is that it's one of those things like partial
>>> matching of list and vector element names that seemed like a good idea
>>> at first, but turns out to be more trouble than it's worth.
>>>
>>> On a related topic, partial matching does not seem to work for the
>>> "names" attribute (which I would regard as a good thing :-). However,
>>> I'm puzzled why it doesn't work, because the code in do_attr() seems to
>>> try hard to make it work. Can anybody explain why?
>>>
>>> E.g.:
>>> > attr(x, "names")
>>> [1] "a" "b"
>>> > attr(x, "nam")
>>> NULL
>>>
>>> > sessionInfo()
>>> R version 2.4.1 (2006-12-18)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>> States.1252;LC_MONETARY=English_United
>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] "stats" "graphics" "grDevices" "utils" "datasets" "methods"
>>> [7] "base"
>>> >
>>>
>>> -- Tony Plate
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595