[Rd] [R] rownames, colnames, and date and time
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Mar 30 14:41:40 CEST 2006
On Thu, 30 Mar 2006, Patrick Burns wrote:
> I haven't been following all of this thread, but
> it reminds me of a bug that was in S-PLUS not
> too long ago where dimnames could sometimes
> be numeric. This caused some problems that
> were very hard to track down because there were
> no visual clues of what was really wrong.
>
> I've been pleased not to encounter that in R and
> hope it continues.
Yes, the C-level assignment code ensures that what is assigned is
character or NULL (even if you set the attribute rather than use
dimnames<-).
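
A minimal sketch of that guarantee, assuming a current R session (the object name `m` is purely illustrative): integer and numeric values end up stored as character whether they go through dimnames<- or are set directly as an attribute.

```r
m <- matrix(1:4, nrow = 2)

# Through the replacement function: numeric names are stored as character.
dimnames(m) <- list(1:2, c(10, 20))
str(dimnames(m))

# Setting the attribute directly goes through the same C-level assignment code.
attr(m, "dimnames") <- list(3:4, NULL)
str(dimnames(m))
```

Inspecting the result with str() rather than print() is what makes the storage type visible.
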
>
> Patrick Burns
> patrick at burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
>
> Prof Brian Ripley wrote:
>
>> Looking at the code it occurs to me that there is another case you have
>> not considered, namely dimnames().
>>
>> rownames<- and colnames<- are just wrappers for dimnames<-, so consistency
>> does mean that all three should behave the same.
>>
>> For arrays (including matrices), dimnames<- is primitive. It coerces
>> factors to character, and says in the C code
>>
>>     /* if (isObject(val1)) dispatch on as.character.foo, but we don't
>>        have the context at this point to do so */
>>
>> so someone considered this before now.
>>
>> For data frames, dimnames<-.data.frame is used. That calls row.names<-
>> and names<-, and the first has a data.frame method. Only the row.names<-
>> method is documented to coerce its value to character, and I think it _is_
>> all quite consistent. The basic rule is that all these functions coerce
>> for data frames, and none do for arrays.
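
The difference between the two paths can be sketched as follows. This is only an illustration: the names `d`, `df` and `m` are made up here, `tz = "UTC"` is an assumption added for reproducibility, and the exact strings printed may differ between R versions.

```r
d  <- as.POSIXct(c("2001-01-24", "2005-12-25"), tz = "UTC")
df <- data.frame(matrix(1:4, nrow = 2))
m  <- matrix(1:4, nrow = 2)

# Data frame: the row.names<- method coerces its value to character.
row.names(df) <- d

# Matrix: goes through the primitive dimnames<-.
rownames(m) <- d

rownames(df)
rownames(m)
```

In both cases the stored names are character; what differs is how the date-times are turned into strings.
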
>>
>> However, there was a problematic assumption in the row.names<-.data.frame
>> and dimnames<-.data.frame methods, which tested the length of 'value'
>> before coercion. That sounds reasonable, but in unusual cases such as
>> POSIXlt, coercion changes the length, and I have swapped the lines around.
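
To illustrate why the order of the test matters, here is a small sketch (not the code from the data-frame methods). A POSIXlt object is stored as a list of components, so the length of the underlying object need not match the number of date-times, whereas the coerced character vector has one element per date-time. The `tz = "UTC"` argument is an assumption added only to make the example reproducible.

```r
lt <- as.POSIXlt(c("2001-01-24", "2005-12-25"), tz = "UTC")

n_components <- length(unclass(lt))       # list components (sec, min, hour, ...)
n_names      <- length(as.character(lt))  # one string per date-time

n_components
n_names
```

A length check made on the uncoerced list structure would therefore compare the wrong number against the number of rows.
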
>>
>> What you expected was that dimnames<-() would coerce to character,
>> although I can find no support for that expectation in the documentation.
>> If it were not a primitive function that would be easy to achieve, but as
>> it is, it would need an expert in the internal code to change. There is
>> also the risk of inconsistency, since as the comment says, the C code is
>> used in places where the context is not known. I think this is probably
>> best left alone.
>>
>>
>> On Wed, 29 Mar 2006, Prof Brian Ripley wrote:
>>
>>
>>
>>> Yet again, this is the wrong list for suggesting changes to R. Please do use
>>> R-devel for that purpose (and I have moved this).
>>>
>>> If this bothers you (it all works as documented, so why not use it as
>>> documented?), please supply a suitable patch to the current R-devel sources
>>> and it will be considered.
>>>
>>> And BTW, row.names is the canonical accessor function for data frames,
>>> and its 'value' argument is documented differently from that for rownames for
>>> an array. Cf:
>>>
>>> Details:
>>>
>>>      The extractor functions try to do something sensible for any
>>>      matrix-like object 'x'. If the object has 'dimnames' the first
>>>      component is used as the row names, and the second component (if
>>>      any) is used for the col names. For a data frame, 'rownames' and
>>>      'colnames' are equivalent to 'row.names' and 'names' respectively.
>>>
>>> Note:
>>>
>>>      'row.names' is similar to 'rownames' for arrays, and it has a
>>>      method that calls 'rownames' for an array argument.
>>>
>>> I am not sure why R decided to add rownames for the same purpose as
>>> row.names: eventually they were made equivalent.
>>>
>>>
>>> On Tue, 21 Mar 2006, Erich Neuwirth wrote:
>>>
>>>
>>>
>>>> I noticed something surprising (in R 2.2.1 on WinXP).
>>>> According to the documentation, rownames and colnames are character
>>>> vectors.
>>>> Assigning a vector of class POSIXct or POSIXlt as rownames or colnames
>>>> therefore is not strictly according to the rules.
>>>> In some cases, R performs a reasonable typecast, but in some other cases
>>>> where the same typecast also would be possible, it does not.
>>>>
>>>> Assigning a vector of class POSIXct to the rownames or names of a
>>>> dataframe creates a reasonable string representation of the dates (and
>>>> possibly times).
>>>> Assigning such a vector to the rownames or colnames of a matrix produces
>>>> rownames or colnames consisting of the integer representation of the
>>>> date-time value.
>>>> Trying to assign a vector of class POSIXlt produces an error in all
>>>> cases (dataframes and matrices; rownames, colnames, names).
>>>>
>>>> Demonstration code is given below.
>>>>
>>>> This is somewhat inconsistent.
>>>> Perhaps a reasonable solution could be that the typecast
>>>> used for POSIXct and dataframes is used in all the other cases also.
>>>>
>>>> Code:
>>>>
>>>> mymat<-matrix(1:4,nrow=2,ncol=2)
>>>> mydf<-data.frame(mymat)
>>>> mydates<-as.POSIXct(c("2001-1-24","2005-12-25"))
>>>>
>>>> rownames(mydf)<-mydates
>>>> names(mydf)<-mydates
>>>> rownames(mymat)<-mydates
>>>> colnames(mymat)<-mydates
>>>>
>>>> print(deparse(mydates))
>>>> print(deparse(rownames(mydf)))
>>>> print(deparse(names(mydf)))
>>>> print(deparse(rownames(mymat)))
>>>> print(deparse(colnames(mymat)))
>>>>
>>>> mydates1<-as.POSIXlt(mydates)
>>>>
>>>> # the following lines will not work and
>>>> # produce errors
>>>>
>>>> rownames(mydf)<-mydates1
>>>> names(mydf)<-mydates1
>>>> rownames(mymat)<-mydates1
>>>> colnames(mymat)<-mydates1
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595