[Rd] Why is strptime always returning a vector of length 9 ?

Martin Maechler maechler at stat.math.ethz.ch
Mon Aug 10 08:55:52 CEST 2009


>>>>> "l" == laurent  <lgautier at gmail.com>
>>>>>     on Sun, 09 Aug 2009 21:45:07 +0200 writes:

    l> Thanks.  It seems that my confusion comes from first
    l> using str() (and then, once on the wrong track, it is
    l> easier to miss the information in a man page that also
    l> describes POSIXct, which is itself a vector of length
    l> equal to the number of entries it contains).

    l> With the current example:
    >> str(xd)
    l>  POSIXlt[1:9], format: "2007-03-09" "2007-05-31"
    l> "2008-11-12" "2008-11-12" ...

    l> A quick inspection of the output does indicate
    l> something with nine elements, but the elements appear to
    l> be "2007-03-09", "2007-05-31", etc... possibly creating
    l> confusion.

    l> To make it even more confusing I have:
    >> x[1]
    l> [1] "March 09, 2007"
    >> str(x[1])
    l>  chr "March 09, 2007"

    l> For what it is worth, I think that the behavior of the
    l> extract operator "[" (defined as an S3 method
    l> "[.POSIXlt()") is inconsistent with the output of
    l> length() (default method for lists).

Yes, exactly;  these two have been defined inconsistently,
exactly in the respect you mention.

Many months ago, when I came to the same conclusion, 
I vaguely remember that I had wanted / proposed to change this
(namely, change [.POSIXlt, the  length() method for  "POSIXlt") ,
but IIRC had been vetoed by others.
... another frustrating experience for me ...
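
To make the mismatch concrete, here is a minimal sketch.  It assumes the
same six dates as in the example, rewritten in ISO form and parsed with
tz = "UTC" so that the result does not depend on the locale's month
names; the names n_components, n_times and n_ct are only illustrative.

```r
# Same six dates as in the thread, in ISO form (an assumption of this
# sketch, made to avoid locale-dependent parsing of "%B").
x <- c("2007-03-09", "2007-05-31", "2008-11-12",
       "2008-11-12", "2009-07-30", "2009-07-30")
xd <- strptime(x, format = "%Y-%m-%d", tz = "UTC")

# Number of components of the underlying list; this is what the default
# length() method for lists counts.
n_components <- length(unclass(xd))

# Number of time points: each component holds one entry per input
# string, and this is what "[" indexes over.
n_times <- length(unclass(xd)$mday)

# Converting to POSIXct gives a plain vector, so its length is the
# number of time points.
n_ct <- length(as.POSIXct(xd))
```

In the R versions discussed in this thread n_components is 9, while
n_times and n_ct are both 6, so as.POSIXct() is one way to get a length
that agrees with "[".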

Martin Maechler, ETH Zurich



    l> On Sun, 2009-08-09 at 11:45 -0500, Jeff Ryan wrote:
    >> The reason is in the ?strptime under value:
    >> 
    >> 'strptime' turns character representations into an object of class
    >> '"POSIXlt"'.  The timezone is used to set the 'isdst' component
    >> and to set the '"tzone"' attribute if 'tz != ""'.
    >> 
    >> And POSIXlt is a list of length 9.
    >> 
    >> 
    >> HTH
    >> Jeff
    >> 
    >> On Sun, Aug 9, 2009 at 10:35 AM, Gabor
    >> Grothendieck<ggrothendieck at gmail.com> wrote:
    >> > Try this to see its components:
    >> >
    >> >> str(unclass(xd))
    >> > List of 9
    >> >  $ sec  : num [1:6] 0 0 0 0 0 0
    >> >  $ min  : int [1:6] 0 0 0 0 0 0
    >> >  $ hour : int [1:6] 0 0 0 0 0 0
    >> >  $ mday : int [1:6] 9 31 12 12 30 30
    >> >  $ mon  : int [1:6] 2 4 10 10 6 6
    >> >  $ year : int [1:6] 107 107 108 108 109 109
    >> >  $ wday : int [1:6] 5 4 3 3 4 4
    >> >  $ yday : int [1:6] 67 150 316 316 210 210
    >> >  $ isdst: int [1:6] 0 1 0 0 1 1
    >> >
    >> > and read R News 4/1 for more.
    >> >
    >> > On Sun, Aug 9, 2009 at 10:20 AM, laurent<lgautier at gmail.com> wrote:
    >> >> Dear List,
    >> >>
    >> >>
    >> >> I am having an issue with strptime (see below).
    >> >> I can reproduce it on R-2.8, R-2.9, and R-2.10-dev, and I am tempted to see
    >> >> either a bug or my misunderstanding (and then I just don't currently see
    >> >> where).
    >> >>
    >> >> # setup:
    >> >> x <- c("March 09, 2007", "May 31, 2007", "November 12, 2008",
    >> >>        "November 12, 2008", "July 30, 2009", "July 30, 2009")
    >> >>
    >> >> # showing the problem
    >> >>> length(x)
    >> >> 6
    >> >>> xd <- strptime(x, format = "%B %d, %Y")
    >> >>> length(xd)
    >> >> 9
    >> >>> xd[1:9]
    >> >> [1] "2007-03-09" "2007-05-31" "2008-11-12" "2008-11-12" "2009-07-30"
    >> >> [6] "2009-07-30" NA           NA           NA
    >> >>> length(strptime(rep(x, 2), format="%B %d, %Y"))
    >> >> [1] 9
    >> >>> strptime(rep(x, 2), format="%B %d, %Y")[1:12]
    >> >>  [1] "2007-03-09" "2007-05-31" "2008-11-12" "2008-11-12" "2009-07-30"
    >> >>  [6] "2009-07-30" "2007-03-09" "2007-05-31" "2008-11-12" "2008-11-12"
    >> >> [11] "2009-07-30" "2009-07-30"
    >> >>
    >> >> Any pointer would be appreciated.
    >> >>
    >> >> L.


