[R] Time series questions
kchkchkch
karihay at gmail.com
Wed Jan 18 20:29:14 CET 2012
Hi, I am trying to teach myself some time series analysis.
I have quarterly time series data on GDP from 1947 to 2011. The column names
are "Year", "Quarter", "GDP" and "GDP.deflator".
The first problem I have is that the 4th quarter of 2010 is missing. It is not
even NA: there is no record for Year=2010 and Quarter=4, so instead of 260 rows I
only have 259. To solve this, I created a temporary DF with complete "Year" and
"Quarter" columns
Yr.temp = rep(1947:2011, rep(4,65))
Qtr.temp = rep(1:4, 65)
Temp.df = data.frame(cbind(Yr.temp,Qtr.temp))
and merged the two, so now I have the NAs.
So my DF is gdpdata, with the above four columns and 260 rows.
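
A minimal sketch of the padding step described above, for readers who want to reproduce it. The GDP values here are made up, and `full` and `raw` are hypothetical names; only the column names and the missing 2010 Q4 row come from the description.

```r
# Toy stand-in for the real data: 1947Q1..2011Q4 with the 2010Q4 row absent.
# The GDP values are placeholders for illustration only.
full <- expand.grid(Quarter = 1:4, Year = 1947:2011)   # complete 260-row grid
raw  <- full[!(full$Year == 2010 & full$Quarter == 4), ]
raw$GDP <- seq_len(nrow(raw))                          # 259 rows

# Merge against the complete grid; all.x = TRUE keeps unmatched grid rows as NA.
gdpdata <- merge(full, raw, by = c("Year", "Quarter"), all.x = TRUE)

# Sort explicitly so the rows are in time order before building a ts object.
gdpdata <- gdpdata[order(gdpdata$Year, gdpdata$Quarter), ]

nrow(gdpdata)                                          # 260
gdpdata[is.na(gdpdata$GDP), c("Year", "Quarter")]      # the padded quarter
```

Naming the key columns identically in both data frames lets `merge()` match on them directly, and the explicit sort matters because `ts()` assumes the values are already in chronological order.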
My first question: what is the difference between
gdp.ts <- ts(gdpdata$GDP, start=1947, end=2011, fr=4)
and
gdp2.ts <- ts(gdpdata$GDP, start=c(1947,1), end=c(2011,4), fr=4)
I get different outputs for time(gdp.ts) and time(gdp2.ts), and neither makes
sense.
time(gdp.ts) gives me this:
     Qtr1 Qtr2 Qtr3 Qtr4
1947 1947 1947 1948 1948
1948 1948 1948 1948 1949
1949 1949 1949 1950 1950
snip
2009 2009 2009 2010 2010
2010 2010 2010 2010 2011
2011 2011
time(gdp2.ts) gives me this:
     Qtr1 Qtr2 Qtr3 Qtr4
1947 1947 1947 1948 1948
1948 1948 1948 1948 1949
1949 1949 1949 1950 1950
snip
2009 2009 2009 2010 2010
2010 2010 2010 2010 2011
2011 2011 2011 2012 2012
Where did the missing values for 2011 go in gdp.ts? Why are there five 2010s
in both, and only two 1947s?
cycle(gdp2.ts) is correct, cycle(gdp.ts) is not.
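
A small sketch that may help compare the two calls. It uses placeholder values (`x`, `a` and `b` are hypothetical names), and it assumes the output above was printed with a low `digits` setting, which the post does not state. A bare `end = 2011` is read as time 2011.0, that is 2011 Q1, so the first call keeps only 257 of the 260 values. The first column of the printed `time()` output also appears to be the row label (the year) rather than a value.

```r
x <- seq_len(260)   # placeholder values, one per quarter 1947Q1..2011Q4

a <- ts(x, start = 1947, end = 2011, frequency = 4)              # end = 2011 Q1
b <- ts(x, start = c(1947, 1), end = c(2011, 4), frequency = 4)  # end = 2011 Q4

length(a)   # 257: the last three quarters are cut off
length(b)   # 260
end(a)      # 2011 1
end(b)      # 2011 4

# The underlying times are fractional years ...
as.numeric(time(b))[1:4]   # 1947.00 1947.25 1947.50 1947.75

# ... and only look like whole years when printed with few significant digits.
print(time(b), digits = 4)
```

Under that reading, each row of the printed table is a year label followed by four quarter times rounded to whole numbers, which would explain the apparent run of repeated years.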
My next question is about the NA in there for 2010q4. None of the (extremely
basic, I'm still learning) time series functions I've been learning work with it.
For example
m <- decompose(gdp2)
returns
Error in na.omit.ts(x) : time series contains internal NAs
I have tried
removeNA(gdp2)
and I get
Error in na.omit.ts(x, ...) : time series contains internal NAs
I have tried
na.omit(gdp2)
and I get
Error in na.omit.ts(gdp2) : time series contains internal NAs
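
One possible workaround, sketched on synthetic data: a `ts` object has to stay regularly spaced, so an NA in the middle cannot simply be dropped, but it can be filled in. The example below uses linear interpolation with base R's `approx()`. The series values and the name `filled` are made up for illustration, and interpolation is an assumption about the data, not a substitute for finding the real 2010 Q4 figure.

```r
# Synthetic quarterly series with trend and seasonality (values are made up).
set.seed(1)
n <- 260
x <- 100 + 0.5 * seq_len(n) +
  rep(c(2, -1, 0.5, -1.5), length.out = n) + rnorm(n, sd = 0.2)
gdp2 <- ts(x, start = c(1947, 1), frequency = 4)

# 2010 Q4 is observation (2010 - 1947) * 4 + 4 = 256.
gdp2[256] <- NA

# Fill the gap by linear interpolation between the neighbouring quarters.
idx <- seq_along(gdp2)
ok  <- !is.na(gdp2)
filled <- gdp2
filled[!ok] <- approx(idx[ok], gdp2[ok], xout = idx[!ok])$y

# With no internal NAs left, decompose() runs.
m <- decompose(filled)
names(m)
```

Linear interpolation ignores the seasonal pattern, so for strongly seasonal data a seasonally aware fill may be preferable. The zoo package's `na.approx()` performs the same kind of interpolation in a single call.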
Any insight you can share on why this is happening, whether putting the
NAs in was a good idea at all (intuitively it seems like one,
because otherwise I'd be out of sync with everything after 2010q4), and
good instructional reading on how to handle data like this in time series
(my googlefu seems out of whack) would be extremely welcome.
I do not understand what to do with the NAs. Remove them? Will my time
series functions work with the missing 2010q4? Not add them in at all?
Thanks in advance.