[R] Confusion with Converting Factors to Dates using as.date
Josip Dasovic
jjd9 at sfu.ca
Wed Dec 10 21:41:00 CET 2008
Dear R-Helpers:
I'm having a problem getting dates into the correct format. I have a data frame, which is based on a .csv file that I imported into R via read.table.
R has converted my date variables to factors. When I use as.Date, most of the values are converted "correctly" (by which I mean "as I wish them to be"), but some are not.
Here's what I have:
str(pk.df)
'data.frame': 206 obs. of 134 variables:
$ uniqid : int 010 015 120 130 210 245 320 330 415 ...
$ st_date : Factor w/ 154 levels "01/01/48","01/01/51",..: 46 27 NA 12 118 NA 63 127 NA NA ...
...
I then convert the variable to class Date using
st_date.new <- as.Date(st_date, format = "%m/%d/%y")
This _seems_ to work...
str(st_date.new)
Class 'Date' num [1:206] 8150 8466 NA 33982 10149 ...
But notice the 4th observation; I would like it to be 1963, not 2063.
st_date.new[1:10]
[1] "1992-04-25" "1993-03-07" NA "2063-01-15" "1997-10-15"
[6] NA "1991-05-31" "1994-11-20" NA NA
st_date[1:10]
[1] 04/25/92 03/07/93 <NA> 01/15/63 10/15/97 <NA> 05/31/91
[8] 11/20/94 <NA> <NA>
154 Levels: 01/01/48 01/01/51 01/01/52 01/01/59 01/01/63 ... 12/31/96
I thought the problem might be that I was converting a factor, so I first converted the variable to character (although I understand that as.Date does this automatically) and then to Date, but I still had the same problem. Does anybody know why I am getting this behavior and how I can solve it?
One more tidbit: the earliest date for which the conversion is "correct" is 1969-04-15, while the most recent date for which the century is "incorrect" is 1967-11-05.
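The 1967/1969 split reported above matches the documented behavior of the %y input format in ?strptime: two-digit years 00 to 68 are read as 20xx and 69 to 99 as 19xx, so the factor is not the cause. Below is a minimal sketch of one possible workaround, not a definitive fix. It uses five sample values copied from the output above and assumes that no genuine date in the data lies in the future, so any parsed date later than today is moved back by one century. The names d, in_future and yr are illustrative only.

```r
# Sample values copied from the st_date output shown earlier
st_date <- factor(c("04/25/92", "03/07/93", NA, "01/15/63", "10/15/97"))

# Convert explicitly via character; on input, %y maps 00-68 to 20xx
# and 69-99 to 19xx
d <- as.Date(as.character(st_date), format = "%m/%d/%y")

# Assumption: no real date in this data is later than today, so a parsed
# date in the future must belong to the previous century
in_future <- !is.na(d) & d > Sys.Date()

# Rebuild the affected dates with the year shifted back by 100
yr <- as.integer(format(d[in_future], "%Y")) - 100L
d[in_future] <- as.Date(paste0(yr, format(d[in_future], "-%m-%d")))

d
```

With these sample values the fourth element should come out as "1963-01-15" while the others are unchanged. If the data could legitimately contain future dates, a fixed cutoff year would be a safer rule than comparing against Sys.Date().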
Thanks,
Josip
Research Associate
Human Security Report Project
School for International Studies
Simon Fraser University
Suite 7200--515 W. Hastings St.
Vancouver, BC V6B 5K3 Canada