[R] Time series and stl in R: Error only univariate series are allowed
rodri178
rodri178 at mail.chapman.edu
Thu May 10 04:47:59 CEST 2012
I am analyzing hourly precipitation data from a disorganized file.
I managed to clean it up and store it in a data frame (called CA1),
which has the following form:
  Station_ID Guage_Type   Lat   Long       Date Time_Zone Time_Frame
1    4457700         HI 41.52 124.03 1948-07-01         8        LST
2    4457700         HI 41.52 124.03 1948-07-05         8        LST
3    4457700         HI 41.52 124.03 1948-07-06         8        LST
4    4457700         HI 41.52 124.03 1948-07-27         8        LST
5    4457700         HI 41.52 124.03 1948-08-01         8        LST
6    4457700         HI 41.52 124.03 1948-08-17         8        LST
  H0 H1 H2 H3 H4 H5        H6        H7        H8        H9       H10       H11
1  0  0  0  0  0  0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
2  0  1  1  1  1  1 2.0000000 2.0000000 2.0000000 4.0000000 5.0000000 5.0000000
3  1  1  1  0  1  1 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
4  3  0  0  0  0  0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
5  0  0  0  0  0  0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
6  0  0  0  0  0  0 0.3888889 0.3888889 0.3888889 0.3888889 0.3888889 0.3888889
  H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23
1   0   0   0   0   0   0   0   0   0   0   0   0
2   4   7   1   1   0   0  10  13   5   1   1   3
3   0   0   0   0   0   0   0   0   0   0   0   0
4   0   0   0   0   0   0   0   0   0   0   0   0
5   0   0   0   0   0   0   0   0   0   0   0   0
6   6   1   0   0   0   0   0   0   0   0   0   0
where H0 through H23 represent the 24 hours of each day (row).
Using only CA1 (the data frame above), I take each day (row) of 24 points,
transpose it, and concatenate all the days (rows) into one variable, which
I call dat1:
> dat1[1:48,]
 H0  H1  H2  H3  H4  H5  H6  H7  H8  H9 H10 H11 H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23
  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
 H0  H1  H2  H3  H4  H5  H6  H7  H8  H9 H10 H11 H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23
  0   1   1   1   1   1   2   2   2   4   5   5   4   7   1   1   0   0  10  13   5   1   1   3
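For illustration, the row-by-row flattening described above can be sketched as follows. This is a minimal sketch, not the code actually used to build dat1: the data frame `toy` and the names `hour_cols` and `dat_vec` are hypothetical stand-ins, with two days of values copied from the first two rows of CA1.

```r
# Hypothetical stand-in for CA1: two days with 24 hourly columns H0..H23.
hour_cols <- paste0("H", 0:23)
hours <- rbind(rep(0, 24),
               c(0, 1, 1, 1, 1, 1, 2, 2, 2, 4, 5, 5,
                 4, 7, 1, 1, 0, 0, 10, 13, 5, 1, 1, 3))
colnames(hours) <- hour_cols
toy <- data.frame(Date = as.Date(c("1948-07-01", "1948-07-05")), hours)

# t() turns each day into a column; as.vector() then reads the matrix
# column by column, giving day 1 hours 0..23 followed by day 2 hours 0..23.
dat_vec <- as.vector(t(as.matrix(toy[, hour_cols])))

length(dat_vec)        # 2 days * 24 hours
is.null(dim(dat_vec))  # a plain vector has no dim attribute
```

Note that this sketch produces a plain numeric vector, whereas indexing as `dat1[1:48,]` suggests that dat1 itself has two dimensions.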
I then pass dat1 to ts to create a time series:
> rainCA1 <- ts(dat1, start = c(1900+as.POSIXlt(CA1[1,5])$year,
                                1+as.POSIXlt(CA1[1,5])$mon),
                frequency = 24)
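As a side note on how ts reads these arguments: with start = c(a, b), the second element is the position within a cycle of `frequency` observations, so with frequency = 24 it runs from 1 to 24 and is not treated as a calendar month. The sketch below uses made-up values (`x` is hypothetical) only to show that bookkeeping.

```r
# 48 made-up observations, starting at position 7 of cycle 1948,
# with 24 observations per cycle.
x <- ts(seq_len(48), start = c(1948, 7), frequency = 24)

start(x)    # cycle 1948, position 7
time(x)[1]  # 1948 + (7 - 1) / 24
end(x)      # 48 observations later: cycle 1950, position 6
```

Under this reading, each unit step on the time axis of rainCA1 spans 24 observations (one day of hourly data), even though the unit is labelled with a year.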
A few things to note:
>dim(CA1)
[1] 5636 31
>length(dat1)
[1] 135264
Thus 5636 rows * 24 data points per row = 135264 total points, and
length(rainCA1) agrees with this. However, if I supply an end argument to
the ts function, such as
> rainCA1 <- ts(dat1, start = c(1900+as.POSIXlt(CA1[1,5])$year,
                                1+as.POSIXlt(CA1[1,5])$mon),
                end = c(1900+as.POSIXlt(CA1[5636,5])$year,
                        1+as.POSIXlt(CA1[5636,5])$mon),
                frequency = 24)
I get a series of only 1134 points, so a lot of data is missing. I assume
this is because the dates are not consecutive and because I only supply the
year and month as the start and end points.
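The shortened length can be checked with a little arithmetic: when both start and end are given, ts keeps (end - start) * frequency + 1 observations and drops the rest of the data. In the sketch below, the end value c(1995, 12) is an assumption, not taken from CA1; it is simply one value consistent with the reported length of 1134. The helper `n_obs` is hypothetical.

```r
# Number of observations between start = c(a1, b1) and end = c(a2, b2),
# counting both endpoints, at a given frequency.
n_obs <- function(start, end, frequency) {
  (end[1] - start[1]) * frequency + (end[2] - start[2]) + 1
}

# Assumed end of c(1995, 12); the real value of CA1[5636, 5] is not shown.
n_obs(c(1948, 7), c(1995, 12), 24)

# ts() truncates the 135264 supplied values to that many observations.
x <- ts(seq_len(135264), start = c(1948, 7), end = c(1995, 12),
        frequency = 24)
length(x)
```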
Continuing, in what I think is the correct path, using the first ts
calculation without the end argument, I supply it as an input for stl:
>rainCA1_2 <-stl(rainCA1, "periodic")
Unfortunately, I get an error:
Error in stl(rainCA1, "periodic") : only univariate series are allowed
I don't understand this error or how to resolve it. However, if I return to
the ts function and provide the end argument, stl works fine without any
errors.
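For reference, stl raises this message when the series it receives is a matrix, even a matrix with a single column. The sketch below reproduces that on made-up data (`vals` and the other names are hypothetical); whether this is what happens with dat1 is an assumption, suggested only by the two-dimensional indexing in `dat1[1:48,]`.

```r
set.seed(1)
vals <- rnorm(24 * 10)  # made-up data: 10 cycles of 24 observations

as_matrix <- ts(matrix(vals, ncol = 1), frequency = 24)  # keeps a dim attribute
as_vector <- ts(vals, frequency = 24)                    # plain univariate series

is.matrix(as_matrix)
is.matrix(as_vector)

# The one-column matrix is rejected; capture the error message.
msg <- tryCatch(stl(as_matrix, "periodic"),
                error = function(e) conditionMessage(e))
msg

# A plain vector, or the single column extracted from the matrix, is accepted.
fit_vector <- stl(as_vector, "periodic")
fit_column <- stl(as_matrix[, 1], "periodic")
```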
I have searched many forums to understand the error and came to the
conclusion that my data is not continuous because of the gaps between days,
but no one (to my understanding) provides a good solution for handling the
time attributes of hourly data. How can I get the ts function to accommodate
my data given those time gaps? If anyone could help me, I would greatly
appreciate it. Thank you!
--
View this message in context: http://r.789695.n4.nabble.com/Time-series-and-stl-in-R-Error-only-univariate-series-are-allowed-tp4622375.html
Sent from the R help mailing list archive at Nabble.com.