[R] Time value not sorting properly
Joshua Wiley
jwiley.psych at gmail.com
Fri Jul 9 02:49:41 CEST 2010
Jared,
I am not sure how you converted your 'time' variable from a factor to
numeric, but as.numeric() on a factor returns its internal level codes
rather than the times themselves. You probably want to convert it to one
of the date-time classes instead. To learn more about them in R, see
?DateTimeClasses.
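To make the factor pitfall concrete, here is a minimal sketch. The four time strings are a hypothetical subset chosen for illustration, and the date and the "UTC" time zone are assumptions, not something taken from your full data set.

```r
# Hypothetical mini-example: four unpadded time strings stored as a factor
times <- factor(c("3:00:50", "21:01:13", "0:01:35", "10:00:15"))

# Factor levels are sorted as text, so "3:00:50" lands after "21:01:13"
levels(times)

# as.numeric() returns the level codes, not anything clock-related
codes <- as.numeric(times)
codes  # 4 3 1 2

# Going through character strings and a time class restores clock order
# (the date and tz = "UTC" are placeholders for this sketch)
parsed <- as.POSIXct(strptime(paste("2010-07-01", as.character(times)),
                              format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
chronological <- as.character(times)[order(parsed)]
chronological  # "0:01:35" "3:00:50" "10:00:15" "21:01:13"
```

Sorting on the codes follows the text order of the levels, which need not match the order of the times on the clock.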
Another nice feature of these special time classes is that they can
handle year, month, day, and time all in one column. This means you
only need to sort by two columns (ID and time). You can also look at
?strptime for details on converting character strings into time
variables. An example using your data follows below.
Best regards,
Josh
samp.dat <- structure(list(ID = c(2836L, 2836L, 2836L, 2836L, 2836L, 2836L,
2836L, 2836L, 2836L, 2836L, 2836L, 2836L, 2836L, 2836L, 2836L,
2836L), year = c(2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L,
2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L, 2010L
), month = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L), day = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), time = structure(c(12L, 13L, 14L,
15L, 16L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L), .Label = c("0:01:35",
"10:00:15", "11:00:44", "12:00:17", "13:00:38", "14:00:25", "15:00:53",
"16:00:11", "17:00:23", "18:00:47", "21:01:13", "3:00:50", "6:00:20",
"7:00:42", "8:00:42", "9:00:12"), class = "factor"), Lat = c(-1.2402597,
-1.2397508, -1.2431248, -1.2396636, -1.2304111, -1.2255532, -1.2248113,
-1.2251362, -1.2246384, -1.2245949, -1.2269631, -1.2264911, -1.2251153,
-1.2315372, -1.2578944, -1.242075), Long = c(35.5405911, 35.5406318,
35.5388285, 35.5285848, 35.5139149, 35.5162895, 35.5147305, 35.491731,
35.4918846, 35.4918647, 35.4880909, 35.4837137, 35.4817967, 35.4806165,
35.4670629, 35.5449559), test = c(77L, 120L, 214L, 300L, 345L,
436L, 528L, 585L, 665L, 727L, 813L, 846L, 928L, 1027L, 1093L,
1132L)), .Names = c("ID", "year", "month", "day", "time", "Lat",
"Long", "test"), class = "data.frame", row.names = c(NA, -16L
))
str(samp.dat)
#first combine all time columns using paste()
#then convert to POSIXlt
samp.dat$time2 <- strptime(x = paste(samp.dat$year, "-",
                                     samp.dat$month, "-",
                                     samp.dat$day, " ",
                                     samp.dat$time,
                                     sep = ""),
                           format = "%Y-%m-%d %H:%M:%S")
str(samp.dat) #note how 'time2' is actually a time class now
#ordering becomes easier
temp.or <- order(samp.dat$ID, samp.dat$time2, decreasing=FALSE)
samp.dat <- samp.dat[temp.or, ]
samp.dat #print to screen
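A possible variation, offered only as a sketch: the date-time can be stored as POSIXct (a single number per row) rather than POSIXlt, which tends to be more convenient inside a data frame. The three-row data frame 'dat', the "UTC" time zone, and the 'gap_sec' column are hypothetical names and assumptions made for this illustration.

```r
# Hypothetical three-row stand-in for the real data; 'time' is a factor
dat <- data.frame(ID = c(2836L, 2836L, 2836L),
                  year = 2010L, month = 7L, day = 1L,
                  time = c("21:01:13", "0:01:35", "3:00:50"),
                  stringsAsFactors = TRUE)

# paste() converts the factor to its labels; tz = "UTC" is an assumption
stamp <- paste(dat$year, dat$month, dat$day, dat$time)
dat$time2 <- as.POSIXct(stamp, format = "%Y %m %d %H:%M:%S", tz = "UTC")

# Sort by animal, then by the combined date-time
dat <- dat[order(dat$ID, dat$time2), ]

# Seconds elapsed since the previous row (NA for the first row)
dat$gap_sec <- c(NA, diff(as.numeric(dat$time2)))
dat
```

With more than one ID, the gaps would need to be computed within each animal rather than across the whole column.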
On Thu, Jul 8, 2010 at 4:28 PM, Jared Stabach
<jstabach at rams.colostate.edu> wrote:
> I have a dataframe of animal locations that I need to have in incremental
> order so that I can calculate the distance traveled between each time step.
> However, I have identified a few values that don't seem to sort properly.
> For instance, the last value in the table below should be the first value
> after sorting, since its time value is '00:01:35'. But, for some reason, it
> seems to be recognized after the '21:01:13' value. I also defined the time
> column as a numeric value (originally a factor) with the result shown in the
> 'test' column. As the value is reported as '1132', it seems there is an
> issue with the time value listed.
>
> ID   year month day time     Lat        Long       test
> 2836 2010 7     1   03:00:50 -1.2402597 35.5405911 77
> 2836 2010 7     1   06:00:20 -1.2397508 35.5406318 120
> 2836 2010 7     1   07:00:42 -1.2431248 35.5388285 214
> 2836 2010 7     1   08:00:42 -1.2396636 35.5285848 300
> 2836 2010 7     1   09:00:12 -1.2304111 35.5139149 345
> 2836 2010 7     1   10:00:15 -1.2255532 35.5162895 436
> 2836 2010 7     1   11:00:44 -1.2248113 35.5147305 528
> 2836 2010 7     1   12:00:17 -1.2251362 35.4917310 585
> 2836 2010 7     1   13:00:38 -1.2246384 35.4918846 665
> 2836 2010 7     1   14:00:25 -1.2245949 35.4918647 727
> 2836 2010 7     1   15:00:53 -1.2269631 35.4880909 813
> 2836 2010 7     1   16:00:11 -1.2264911 35.4837137 846
> 2836 2010 7     1   17:00:23 -1.2251153 35.4817967 928
> 2836 2010 7     1   18:00:47 -1.2315372 35.4806165 1027
> 2836 2010 7     1   21:01:13 -1.2578944 35.4670629 1093
> 2836 2010 7     1   00:01:35 -1.2420750 35.5449559 1132
>
> The code I used to sort the dataframe is:
>
> # Sort dataset so values are in incremental order
> temp.or <- order(wildebeest$ID, wildebeest$year, wildebeest$month,
>                  wildebeest$day, wildebeest$time, decreasing=FALSE)
> wildebeest <- wildebeest[temp.or, ]
> Eventually, I will have around 400,000 records, so my script is designed to
> troubleshoot these errors. Is there something that I am missing or is
> there something in this field that could possibly be hidden? Any
> suggestions?
>
> Thanks in advance for any help.
>
> Jared
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/