[R] date in plot, can't add regression line
Rui Barradas
ruipbarradas at sapo.pt
Tue Aug 28 20:14:52 CEST 2012
Hello,
My comments are inline.
On 28-08-2012 18:23, Nordlund, Dan (DSHS/RDA) wrote:
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
>> project.org] On Behalf Of Norbert Skalski
>> Sent: Tuesday, August 28, 2012 9:49 AM
>> To: r-help at r-project.org
>> Subject: [R] date in plot, can't add regression line
>>
>> Hello all,
>>
>> I have been using R for about 3 weeks and I am frustrated by a problem.
>> I have read R in a nutshell, scoured the internet for help but I either
>> am not understanding examples or am missing something completely basic.
>> Here is the problem:
>>
>> I want to plot data that contains dates on the x axis. Then I want to
>> fit a line to the data. I have been unable to do it.
>>
>> This is an example of the data (in a dataframe called
>> "tradeflavorbyday"), 40 lines of it (I'm sorry it's not in a runnable
>> form, not sure how to get that from R) :
>> tradeflavor timestamp x
>> 1 1 2009-01-22 1
>> 2 2 2009-01-22 1
>> 3 1 2009-01-23 1
>> 4 1 2009-01-27 54
>> 5 1 2009-01-28 105
>> 6 2 2009-01-28 2
>> 7 16 2009-01-28 2
>> 8 1 2009-01-29 71
>> 9 16 2009-01-29 2
>> 10 1 2009-01-30 42
>> 11 1 2009-02-02 19
>> 12 16 2009-02-02 2
>> 13 1 2009-02-03 36
>> 14 4 2009-02-03 2
>> 15 8 2009-02-03 3
>> 16 1 2009-02-04 73
>> 17 8 2009-02-04 12
>> 18 16 2009-02-04 7
>> 19 1 2009-02-05 53
>> 20 8 2009-02-05 6
>> 21 16 2009-02-05 9
>> 22 1 2009-02-06 38
>> 23 4 2009-02-06 6
>> 24 8 2009-02-06 2
>> 25 16 2009-02-06 3
>> 26 1 2009-02-09 42
>> 27 2 2009-02-09 2
>> 28 4 2009-02-09 1
>> 29 8 2009-02-09 2
>> 30 1 2009-02-10 87
>> 31 4 2009-02-10 2
>> 32 8 2009-02-10 4
>> 33 16 2009-02-10 3
>> 34 1 2009-02-11 55
>> 35 2 2009-02-11 6
>> 36 4 2009-02-11 4
>> 37 8 2009-02-11 2
>> 38 16 2009-02-11 8
>> 39 1 2009-02-12 153
>> 40 2 2009-02-12 6
>>
>>
>> The plot displays the x column on the y axis and the date on the x axis,
>> grouped by the tradeflavor column.
>> The timestamp column:
>>> class(tradeflavorbyday$timestamp)
>> [1] "POSIXlt" "POSIXt"
>>
>> So in this case I want to plot tradeflavor 1 (method 1):
>>
>> xdates <- tradeflavorbyday$timestamp[tradeflavorbyday$tradeflavor == 1]
>> ydata <- tradeflavorbyday$x[tradeflavorbyday$tradeflavor == 1]
>>
>> plot(xdates, ydata, col="black", xlab="Dates", ylab="Count")
>>
>> Up to here it works great.
>>
>> Now an abline through lm:
>>
>> xylm <- lm(ydata~xdates)  # <-- this fails, can't do dates, see the error below
>> abline(xylm, col="black")
>>
>>> lm(ydata~xdates)
>> Error in model.frame.default(formula = ydata ~ xdates,
>> drop.unused.levels = TRUE) :
>> invalid type (list) for variable 'xdates'
>>
>>
> You might try converting timestamp as follows
>
> xdates <- as.POSIXct(tradeflavorbyday$timestamp[tradeflavorbyday$tradeflavor == 1])
>
> Your original code should now work.
It does, I've just tried it.
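For what it's worth, here is a small sketch of why the conversion matters. A POSIXlt object is stored internally as a list of components (seconds, minutes, hours, ...), which is what the "invalid type (list)" error is complaining about, whereas a POSIXct object is a single numeric vector of seconds since 1970-01-01. The three dates, the y values and tz = "UTC" below are only illustrative assumptions, not the original data frame.

```r
# Illustrative values only; tz = "UTC" is an assumption to keep this reproducible.
lt <- as.POSIXlt(c("2009-01-22", "2009-01-23", "2009-01-27"), tz = "UTC")
ct <- as.POSIXct(lt)

class(lt)               # "POSIXlt" "POSIXt"
is.list(unclass(lt))    # TRUE: a list of components, which lm() rejects
class(ct)               # "POSIXct" "POSIXt"
is.numeric(unclass(ct)) # TRUE: seconds since the epoch, usable in a formula

y <- c(1, 1, 54)
fit <- lm(y ~ ct)       # fits without the model.frame error
```

So the only change needed in the original code is the class of the date vector; the plot() and abline() calls can stay as they are.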
Also, regarding the OP's statement "(I'm sorry it's not in a runnable
form, not sure how to get that from R)":
# It's easy to read in the data
tfday <- read.table(text="
tradeflavor timestamp x
1 1 2009-01-22 1
2 2 2009-01-22 1
3 1 2009-01-23 1
[...etc...]
39 1 2009-02-12 153
40 2 2009-02-12 6
", header=TRUE, stringsAsFactors=FALSE)
# But it's better to paste the output of dput().
dput(tfday)
structure(list(tradeflavor = c(1L, 2L, 1L, 1L, 1L, 2L, 16L, 1L,
16L, 1L, 1L, 16L, 1L, 4L, 8L, 1L, 8L, 16L, 1L, 8L, 16L, 1L, 4L,
8L, 16L, 1L, 2L, 4L, 8L, 1L, 4L, 8L, 16L, 1L, 2L, 4L, 8L, 16L,
1L, 2L), timestamp = structure(c(1232582400, 1232582400, 1232668800,
1233014400, 1233100800, 1233100800, 1233100800, 1233187200, 1233187200,
1233273600, 1233532800, 1233532800, 1233619200, 1233619200, 1233619200,
1233705600, 1233705600, 1233705600, 1233792000, 1233792000, 1233792000,
1233878400, 1233878400, 1233878400, 1233878400, 1234137600, 1234137600,
1234137600, 1234137600, 1234224000, 1234224000, 1234224000, 1234224000,
1234310400, 1234310400, 1234310400, 1234310400, 1234310400, 1234396800,
1234396800), class = c("POSIXct", "POSIXt"), tzone = ""), x = c(1L,
1L, 1L, 54L, 105L, 2L, 2L, 71L, 2L, 42L, 19L, 2L, 36L, 2L, 3L,
73L, 12L, 7L, 53L, 6L, 9L, 38L, 6L, 2L, 3L, 42L, 2L, 1L, 2L,
87L, 2L, 4L, 3L, 55L, 6L, 4L, 2L, 8L, 153L, 6L)), .Names = c("tradeflavor",
"timestamp", "x"), row.names = c("1", "2", "3", "4", "5", "6",
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17",
"18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28",
"29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39",
"40"), class = "data.frame")
# Now all we need to do is copy and paste this into an R session:
tfday <- structure(...etc...)
# Finally, for the sake of completeness, the rest of the code.
tfday$timestamp <- as.POSIXct(tfday$timestamp)
inx <- tfday$tradeflavor == 1 # do this once
xdates <- tfday$timestamp[inx]
ydata <- tfday$x[inx]
plot(xdates, ydata)
model <- lm(ydata ~ xdates)
abline(model)
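
One thing to keep in mind when reading the fitted coefficients: since POSIXct values are counted in seconds, the slope of lm(ydata ~ xdates) is a change in count per second. The following is only a rough sketch, using the first five tradeflavor == 1 rows of the data above and assuming tz = "UTC"; the names slope_per_second, slope_per_day, xdays and model_days are made up for the illustration. It rescales the slope to a per-day figure and compares it with a fit on as.Date() values.

```r
# First five tradeflavor == 1 rows from the data above; tz = "UTC" is an assumption.
xdates <- as.POSIXct(c("2009-01-22", "2009-01-23", "2009-01-27",
                       "2009-01-28", "2009-01-29"), tz = "UTC")
ydata <- c(1, 1, 54, 105, 71)

model <- lm(ydata ~ xdates)
slope_per_second <- unname(coef(model)["xdates"])
slope_per_day <- slope_per_second * 86400  # 86400 seconds in a day

# Fitting against Date values gives a per-day slope directly.
xdays <- as.Date(xdates)
model_days <- lm(ydata ~ xdays)

# Compare the two per-day slopes, allowing for rounding error.
same_slope <- isTRUE(all.equal(slope_per_day,
                               unname(coef(model_days)["xdays"]),
                               tolerance = 1e-6))
```

If the time of day carries no information, as seems to be the case here, working with Date values from the start may make the coefficients easier to read.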
Hope this helps,
Rui Barradas
>
> Hope this is helpful,
>
> Dan
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA 98504-5204
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.