[R] Function StructTS: covariance matrix of the initial state vector
javlacalle at yahoo.es
Wed Nov 5 15:33:41 CET 2014
My question is about the function "StructTS" of the core package "stats", which fits by maximum likelihood the basic structural time series model.
According to theory and the references given in "?StructTS", the covariance matrix of the initial state vector is a diagonal matrix. However, in "StructTS" it is defined as a matrix containing the same value in the diagonal and off-diagonal elements of the matrix. The matrix is defined as follows in "StructTS":
Z$P <- 1e+06 * vx
Does anybody know why is this matrix defined as non-diagonal? May it be some kind of diffuse initialization? That's not the way I understand it in textbooks though, as it still involves diagonal matrices.
I thought it may be a bug in the code. However, as "StructTS" has been defined in this way for years and as the documentation claims that the result for the "AirPassengers" data is superior to the results reported in other source, I wondered whether there may be a reason to use a non-diagonal matrix.
I used the package "stsm" [http://cran.r-project.org/package=stsm] to compare some results based on a non-diagonal and a diagonal initial covariance matrix.
# original result obtained with StructTS
res0 <- stats::StructTS(log(AirPassengers), type = "BSM")
# level slope seas epsilon
# 0.0007719 0.0000000 0.0013969 0.0000000
# StructTS can be replicated using the interface in package "stsm" as follows
mairp <- stsm.model(model = "BSM", y = log(AirPassengers), transPars = "StructTS")
res1 <- maxlik.td.optim(mairp, KF.version = "KFKSDS",
KF.args = list(P0cov = TRUE), method = "L-BFGS-B")
mairp1 <- set.pars(mairp, pmax(res1$par, .Machine$double.eps))
# var4 var1 var2 var3
# 0.0013969 0.0000000 0.0007719 0.0000000
tol = 1e-04, check.attributes = FALSE)
#  TRUE
# next, the likelihood function and the optimization procedure
# are defined as in StructTS except for Z$P, which is now a diagonal matrix
res2 <- maxlik.td.optim(mairp, KF.version = "KFKSDS",
KF.args = list(P0cov = FALSE), method = "L-BFGS-B")
mairp2 <- set.pars(mairp, pmax(res2$par, .Machine$double.eps))
# var4 var1 var2 var3
# 0.0000642 0.0001293 0.0006995 0.0000000
The results imply a different allocation of the variance across the components, as reflected by these ratios:
# var1 var2 var3
# 0.000 0.553 0.000
# var1 var2 var3
# 2.014 10.898 0.000
The estimated seasonal component for the modified "StructTS" version (i.e., the standard approach using a diagonal matrix) looks better, because the range is more homogeneous.
ss2 <- char2numeric(mairp2, P0cov = FALSE)
kf2 <- KFKSDS::KF(mairp1 at y, ss2)
ks2 <- KFKSDS::KS(mairp1 at y, ss2, kf2)
par(mfcol = c(2,2), mar = c(2.5,5.1,3,2.1))
plot(tsSmooth(res0)[,1], main = "StructTS", ylab = "level")
plot(tsSmooth(res0)[,3], ylab = "seasonal")
plot(ks2$ahat[,1], main = "Modified StructTS", ylab = "level")
plot(ks2$ahat[,3], ylab = "seasonal")
[image in attachment airp1.pdf]
Yet, I am still doubtful whether this is a bug or intentional. In some cases, I have found it useful to use P0cov=TRUE (non-diagonal matrix) when the likelihood function is evaluated by the optimization algorithm. Then, given the parameter estimates, the smoothed components can be extracted setting P0cov=FALSE (diagonal matrix) in the Kalman smoother, in order to avoid the erratic behaviour at the beginning of the sample observed in the plot above. See for example my answer in this post [http://stats.stackexchange.com/questions/115506/forecasting-a-seasonal-time-series-in-r].
I wonder if this qualifies as a bug or if there is a reason that explains the non-diagonal matrix used in "StructTS". I would be grateful for some feedback.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 43816 bytes
Desc: not available
More information about the R-help