[R] Extracting part of date variable

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Thu Feb 1 12:48:15 CET 2007


stat stat wrote:
> Dear all,
>    
>   Suppose I have a date variable:
>    
>   c = "99/05/12"
>    
>   I want to extract the parts of this date like month number, year and day. I can do it in SPSS. Is it possible to do this in R as well?
>    
>   Rgd,
>
>   
Yes. One way is to use substr(), e.g.:

> substr(c,1,2)
[1] "99"
> as.numeric(substr(c,1,2))
[1] 99

This also nicely sidesteps the ambiguity issue: 1999 or 1899? May or
December? On the other hand, you'll get into trouble if leading zeros are
sometimes absent (look at strsplit() or gsub() if you want to pursue
that route further).
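As a rough sketch of the strsplit() route, assuming the fields are always
separated by "/" and always come in year/month/day order (the input value
and the names x and parts are made up for illustration):

```r
# Hypothetical input with a missing leading zero; called x rather than c
# to avoid confusion with the base function c().
x <- "99/5/12"

# Split on the literal "/" and convert the pieces to numbers.
parts <- as.numeric(strsplit(x, "/", fixed = TRUE)[[1]])
names(parts) <- c("year", "month", "day")

parts[["month"]]
```

Because the split is on the separator rather than on character positions,
"99/5/12" and "99/05/12" give the same numbers.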

For a more principled approach, use the time and date handling tools.
Assuming that you can live with the system defaults for 2-digit years,

> strptime(c,format="%y/%m/%d")
[1] "1999-05-12"
> strptime(c,format="%y/%m/%d")$year
[1] 99
> strptime(c,format="%y/%m/%d")$mon
[1] 4
> strptime(c,format="%y/%m/%d")$mday
[1] 12
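If you want conventional calendar values rather than the raw fields,
something along these lines should do. It is only a sketch: the variable
names are arbitrary, tz = "UTC" is my choice to keep the result independent
of the local time zone, and it assumes the two-digit year resolves to 1999.

```r
x <- "99/05/12"
d <- strptime(x, format = "%y/%m/%d", tz = "UTC")

year  <- d$year + 1900   # the year field is stored as years since 1900
month <- d$mon + 1       # the mon field is stored as 0-11
day   <- d$mday          # mday is already 1-31

# format() returns the same parts as character strings
month_chr <- format(d, "%m")
```

The format() route is handy when you want the parts as text, e.g. for
building labels.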

Beware the peculiarities of the entries defined by the POSIX standard
($year counts from 1900 and $mon from 0), see ?DateTimeClasses, and also:

     '%y' Year without century (00-99). If you use this on input, which
          century you get is system-specific.  So don't!  Often values
          up to 69 (or 68) are prefixed by 20 and 70(or 69) to 99 by
          19.

(I'm at a bit of a loss as to fixing up two-digit years once the damage
has been done. Presumably, you can just diddle the year field, but I'm a
bit uneasy about the fact that 2000 was a leap year and 1900 was not.)
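One way to avoid the damage in the first place, assuming you know which
century the data belong to, is to paste the century onto the string and
parse it with %Y instead of %y. The helper with_century() below is
hypothetical, not something that exists in R, and it assumes the year is
always given with two digits:

```r
# Hypothetical helper: prepend an explicitly chosen century to a
# "yy/mm/dd" string and parse the result with a four-digit year.
with_century <- function(x, century = "19") {
  as.Date(paste0(century, x), format = "%Y/%m/%d")
}

d1 <- with_century("99/05/12")          # century chosen as 19
d2 <- with_century("00/02/29", "20")    # 29 Feb exists in 2000
d3 <- with_century("00/02/29", "19")    # 29 Feb does not exist in 1900

format(d1, "%Y-%m-%d")
```

Here the leap-year question is settled at parse time: I would expect d3 to
come back as NA, since 1900 had no 29 February.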



-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


