[Rd] Surprising length() of POSIXlt vector (PR#14073)
maechler at stat.math.ethz.ch
Fri Nov 20 19:05:30 CET 2009
>>>>> "PD" == Peter Dalgaard <p.dalgaard at biostat.ku.dk>
>>>>> on Fri, 20 Nov 2009 09:54:34 +0100 writes:
PD> mark at celos.net wrote:
>> Arrays of POSIXlt dates always return a length of 9. This
>> is correct (they're really lists of vectors of seconds,
>> hours, and so forth), but other methods disguise them as
>> flat vectors, giving superficially surprising behaviour:
>>
>> strings <- paste('2009-1-', 1:31, sep='')
>> dates <- strptime(strings, format="%Y-%m-%d")
>>
>> print(dates)
>> # [1] "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" "2009-01-05"
>> # [6] "2009-01-06" "2009-01-07" "2009-01-08" "2009-01-09" "2009-01-10"
>> # [11] "2009-01-11" "2009-01-12" "2009-01-13" "2009-01-14" "2009-01-15"
>> # [16] "2009-01-16" "2009-01-17" "2009-01-18" "2009-01-19" "2009-01-20"
>> # [21] "2009-01-21" "2009-01-22" "2009-01-23" "2009-01-24" "2009-01-25"
>> # [26] "2009-01-26" "2009-01-27" "2009-01-28" "2009-01-29" "2009-01-30"
>> # [31] "2009-01-31"
>>
>> print(length(dates))
>> # [1] 9
>>
>> str(dates)
>> # POSIXlt[1:9], format: "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" ...
>>
>> print(dates[20])
>> # [1] "2009-01-20"
>>
>> print(length(dates[20]))
>> # [1] 9
>>
>> I've since realised that POSIXct makes date vectors easier,
>> but could we also have something like:
>>
>> length.POSIXlt <- function(x) { length(x$sec) }
>>
>> in datetime.R, to avoid breaking functions (like the
>> str.POSIXt method) which use length() in this way?
PD> [You need "wishlist" in the title for this sort of stuff.]
PD> I'd be wary of this. Just the other day we found that identical() broke
PD> on some objects because a package had length() redefined as a class
PD> method. I.e. the danger is that something wants to use length() with its
PD> original low-level interpretation.
Yes, of course.
And Romain mentioned str(). Note that we needed to define
a "POSIXt" method for str(), partly just *because* of the
current anomaly:
As Tony Plate, e.g., has argued, entirely correctly in my view,
the anomaly is that length() and "[" are not compatible;
and while I think no R language definition says that they should
be, I still believe that you need very good reasons for them to
be incompatible, as they are for POSIXlt.
In the current case, for me the only good reason is backwards
compatibility.
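To make the incompatibility between length() and "[" concrete without depending on the behaviour of any particular R version, here is a minimal sketch using a toy class. The class name "toylt", its constructor, and its three components are hypothetical and only stand in for the list-of-vectors layout of POSIXlt; this is not the code in datetime.R.

```r
# Hypothetical toy class "toylt": a list of parallel vectors, like POSIXlt.
toylt <- function(sec, min, hour) {
  structure(list(sec = sec, min = min, hour = hour), class = "toylt")
}

# "[" subsets every component, so the object behaves like a flat vector.
"[.toylt" <- function(x, i) {
  structure(lapply(unclass(x), "[", i), class = "toylt")
}

x <- toylt(sec = c(0, 10, 20, 30), min = c(1, 2, 3, 4), hour = c(5, 6, 7, 8))

# Without a length method, length() counts the list components.
n_before <- length(x)        # 3 components
n_sub    <- length(x[2:3])   # still 3, although two elements were selected

# A length method in the spirit of the proposal makes length() agree with "[".
length.toylt <- function(x) length(unclass(x)$sec)

n_after     <- length(x)       # 4 elements
n_sub_after <- length(x[2:3])  # 2 elements

# The low-level interpretation remains reachable by dropping the class first.
n_lowlevel <- length(unclass(x))  # 3 components
```

Code that really wants the number of list components can still get it via unclass(), which is one way to address the concern about the original low-level interpretation; whether existing code already does so is exactly the cleanup question.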
My personal taste would be to change it and see what happens.
I would be willing to clean up after that change within R 'base'
and all packages I am coauthoring (quite a few), but of course
there are still a thousand more R packages.
My strong bet would be that fewer than 1% would be affected,
and my point guess for the percentage affected would be
more on the order of 1/1000.
The question is whether we (you too!), the R community, are willing to
bear the load of cleanup after such a change, which would really
*improve* consistency of that small corner of R.
For me, as I indicated above, I am willing to bear my share
(and actually have got it ready for R-devel).
Martin Maechler, ETH Zurich (and R Core Team)