[R] Strptime/ date time classes
Caroline Keef
caroline.keef at jbaconsulting.co.uk
Wed Jul 9 19:18:55 CEST 2008
Thank you, but why does this happen?
a =(1:223960)[is.na(datetimes)]
datetimes[a]
>  [1] "1981-03-29 01:20:00" "1990-03-25 01:43:00" "1992-03-29 01:43:00" "1996-03-31 01:30:00" "1996-03-31 01:57:00"
>  [6] "1997-03-30 01:02:00" "1997-03-30 01:14:00" "1997-03-30 01:27:00" "1997-03-30 01:44:00" "1997-03-30 01:55:00"
> [11] "1998-03-29 01:16:00" "1998-03-29 01:41:00" "1998-03-29 01:56:00" "1999-03-28 01:03:00" "1999-03-28 01:18:00"
> [16] "2000-03-26 01:28:00"
Which obviously aren't missing.
I do want POSIXlt, as I need to extract the day of the month (I'm
extracting daily maxima from irregularly observed time series).
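
A minimal sketch of that day-of-month extraction and the daily maxima, using a few rows copied from the data file shown in this thread. The tz = "UTC" argument is an assumption based on the TZUTC+0 entry in the file header, and the names dom, day and daily_max are only illustrative:

```r
# A few rows copied from the data file shown in this thread.
alldata <- data.frame(
  V1 = c("19800604062800", "19800604111900", "19800604134300",
         "19800611111000", "19800611211400", "19800612000000"),
  V2 = c(0.271, 0.286, 0.362, 0.100, 0.096, 0.096),
  stringsAsFactors = FALSE
)

# Assumption: the timestamps are UTC, as the TZUTC+0 header suggests.
datetimes <- strptime(alldata$V1, format = "%Y%m%d%H%M%S", tz = "UTC")

dom <- datetimes$mday                  # day of the month, from the POSIXlt components
day <- format(datetimes, "%Y-%m-%d")   # full calendar day, used as the grouping key

# Maximum of the observations within each calendar day.
daily_max <- tapply(alldata$V2, day, max, na.rm = TRUE)
print(daily_max)
```

Grouping on the full date rather than on mday alone keeps, say, 4 June and 4 July in separate groups.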
This seems like a bug to me, I just thought I'd check with people who
know more than I do.
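
One possible explanation, offered as a sketch rather than a diagnosis: every date listed above is the last Sunday of March, with a time between 01:00 and 02:00. In the UK that hour is skipped when the clocks go forward, so if the strings are interpreted in a UK local time zone they name times that do not exist. The example below assumes the session time zone is Europe/London and that the data are really UTC (the file header says TZUTC+0); whether the non-existent times come back as NA can depend on the platform and R version.

```r
# Three of the timestamps reported as NA in this thread.
stamps <- c("19970330010200", "19970330011400", "20000326012800")
fmt <- "%Y%m%d%H%M%S"

# Interpreted as UK local time: 01:00-01:59 on these dates is skipped
# when the clocks go forward, so these times may be reported as NA.
uk <- strptime(stamps, format = fmt, tz = "Europe/London")
print(is.na(uk))

# Interpreted as UTC: there is no daylight saving, so every time is valid.
utc <- strptime(stamps, format = fmt, tz = "UTC")
print(is.na(utc))
print(utc$mday)   # the day of the month is still available from POSIXlt
```

If that is the cause, passing tz = "UTC" to strptime would keep the POSIXlt class and avoid the missing values.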
Caroline
-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com]
Sent: 09 July 2008 17:24
To: Caroline Keef
Cc: r-help at r-project.org
Subject: Re: [R] Strptime/ date time classes
You probably want POSIXct instead of POSIXlt:
> x <- read.table(textConnection("#TZUTC+0|*|SANR08002|*|SNAMENAUL|*|SWATERDELVIN|*|CNR98808|*|
+ #CNAMEQ|*|CTYPEn-min-ip|*|CMW1440|*|RTIMELVLhigh-resolution|*|
+ #CUNITm3/s|*|RINVAL-777|*|RNR-1|*|REXCHANGE98913|*|
+ #RTYPEinstantaneous values|*|
+ 19800604062759 -777.0
+ 19800604062800 0.271
+ 19800604111900 0.286
+ 19800604134300 0.362
+ 19800604144400 0.465
+ 19800604163300 0.510
+ 19800604175400 0.518
+ 19800604185100 0.526
+ 19800611110900 -777.0
+ 19800611110959 -777.0
+ 19800611111000 0.100
+ 19800611211400 0.096
+ 19800612000000 0.096
+ 19800612065000 0.098
+ 19800612133400 0.100"),colClasses=c('character','numeric'))
> closeAllConnections()
> # you probably want POSIXct not POSIXlt
> datetimes <- as.POSIXct(strptime(x[,1], "%Y%m%d%H%M%S"))
> str(datetimes)
POSIXct[1:15], format: "1980-06-04 06:27:59" "1980-06-04 06:28:00"
"1980-06-04 11:19:00" ...
> length(datetimes)
[1] 15
>
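
The two classes report different lengths because of how they are stored: a POSIXlt object is a list of component vectors (seconds, minutes, hours, day of the month and so on), while a POSIXct object is a single numeric vector. In R 2.7.0, length() on a POSIXlt counts those components, which is where the 9 in the original message comes from; later versions of R count date-times instead. A small sketch of the difference, with tz = "UTC" assumed from the TZUTC+0 file header; the exact set of components can vary between R versions:

```r
stamps <- c("19800604062759", "19800604062800", "19800604111900")

lt <- strptime(stamps, format = "%Y%m%d%H%M%S", tz = "UTC")  # POSIXlt
ct <- as.POSIXct(lt)                                         # POSIXct

print(names(unclass(lt)))   # the list components behind a POSIXlt
print(length(ct))           # one entry per date-time
print(lt$mday)              # each component is itself a full-length vector
```

Either class can therefore be used to count observations reliably via length(ct) or length(lt$mday).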
On Wed, Jul 9, 2008 at 6:09 AM, Caroline Keef
<caroline.keef at jbaconsulting.co.uk> wrote:
> Dear all,
>
> I've come across a problem using strptime, can anyone explain what's
> going on? I'm using version 2.7.0 on Windows XP.
>
> Thank you
>
> Caroline
>
> First read in a data file using read.table
>
> alldata = read.table(file, header=F, skip=4, colClasses =
> c("character","numeric"))
>
> dim(alldata)
> [1] 223960 2
>
> # inefficient, safe way of sorting out missing or dodgy data
>
> alldata[,2][alldata[,2] < 0] = NA
>
> # first ten lines of the data
>
> alldata[1:10,]
> V1 V2
> 1 19800604062759 NA
> 2 19800604062800 0.271
> 3 19800604111900 0.286
> 4 19800604134300 0.362
> 5 19800604144400 0.465
> 6 19800604163300 0.510
> 7 19800604175400 0.518
> 8 19800604185100 0.526
> 9 19800611110900 NA
> 10 19800611110959 NA
>
> #Then convert the first column using strptime
>
> datetimes = strptime(alldata[,1],format="%Y%m%d%H%M%S")
>
> # Then I want to get minimum and maximum, but some seem to be missing
> # when they aren't.
>
> length(as.POSIXlt(datetimes)) #also equal to length(datetimes)
>
> [1] 9
>
> # Why isn't this 223960? Is it something to do with the class?
>
> # This is the really puzzling bit (to me anyway)
>
> a =(1:223960)[is.na(datetimes)]
>
> # which gives
> 1462 14295 18744 50499 50500 92472 92473 92474 92475 92476
> 137525 137526 137527 171066 171067 192353
>
> # 16 values
>
> alldata[a,]
> V1 V2
> 1462 19810329012000 0.983
> 14295 19900325014300 0.219
> 18744 19920329014300 0.246
> 50499 19960331013000 0.564
> 50500 19960331015700 0.563
> 92472 19970330010200 0.173
> 92473 19970330011400 0.172
> 92474 19970330012700 0.172
> 92475 19970330014400 0.172
> 92476 19970330015500 0.172
> 137525 19980329011600 0.427
> 137526 19980329014100 0.427
> 137527 19980329015600 0.427
> 171066 19990328010300 0.223
> 171067 19990328011800 0.223
> 192353 20000326012800 0.189
>
> datetimes[a]
>  [1] "1981-03-29 01:20:00" "1990-03-25 01:43:00" "1992-03-29 01:43:00" "1996-03-31 01:30:00" "1996-03-31 01:57:00"
>  [6] "1997-03-30 01:02:00" "1997-03-30 01:14:00" "1997-03-30 01:27:00" "1997-03-30 01:44:00" "1997-03-30 01:55:00"
> [11] "1998-03-29 01:16:00" "1998-03-29 01:41:00" "1998-03-29 01:56:00" "1999-03-28 01:03:00" "1999-03-28 01:18:00"
> [16] "2000-03-26 01:28:00"
>
> # They're all around the end of March! I've looked at the data file
> # and I can't see anything funny in it around these dates.
>
>
>
> The first few lines of the data file look like
>
> #TZUTC+0|*|SANR08002|*|SNAMENAUL|*|SWATERDELVIN|*|CNR98808|*|
> #CNAMEQ|*|CTYPEn-min-ip|*|CMW1440|*|RTIMELVLhigh-resolution|*|
> #CUNITm3/s|*|RINVAL-777|*|RNR-1|*|REXCHANGE98913|*|
> #RTYPEinstantaneous values|*|
> 19800604062759 -777.0
> 19800604062800 0.271
> 19800604111900 0.286
> 19800604134300 0.362
> 19800604144400 0.465
> 19800604163300 0.510
> 19800604175400 0.518
> 19800604185100 0.526
> 19800611110900 -777.0
> 19800611110959 -777.0
> 19800611111000 0.100
> 19800611211400 0.096
> 19800612000000 0.096
> 19800612065000 0.098
> 19800612133400 0.100
>
>
>
>
>
> Caroline Keef
> JBA Consulting
> South Barn, Broughton Hall, Skipton, North Yorkshire, BD23 3AE, UK
> t: +44 (0)1756 799919 f: +44 (0)1756 799449
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?