[R] 'breackpoints' (package 'strucchange'): 2 blocking error messages when using for multiple regression model testing
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Sat Jul 30 11:34:59 CEST 2011
On Sat, 30 Jul 2011, Michel Lutz wrote:
> Thanks a lot for your answer.
>
> You're right, I need to continue reading some papers to define a relevant
> testing strategy.
>
> Thanks to your package, I think I've got all what I need to proceed.... on
> the exception of using 'breakpoints' with my data !
>
> I've tested all possibilities (dummy or not) and I'm definitely stuck.
Thanks for the data. I'll try to have a closer look (but probably not in
the next week) and then get back to you. A quck & dirty workaround is to
add some small random fuzz to your regressors. The resulting breakpoint
estimates seem to be reliable. See below.
> You'll find here enclosed my data and a full R code showing the issue...
> if you would have a few minutes to take a look into it and tell me what
> could go wrong, it would really be fantastic!
The problem is that the model is not identified on all of the subsets that
it needs to be estimated on. This means that the (more efficient)
algorithm I'm using to compute the solution runs into numerical problems.
I try to catch these problems but apparently not successfully in all
cases.
Here's what you can do already:
## read and rescale data (for greater numerical stability)
D <- read.csv2("D.csv")
D <- transform(D,
CPU = CPU/1e5,
PREP = PREP/1e4,
BRG = BRG/1e3,
CLOG = CLOG/1e3,
DUMMY = DUMMY)
## model and F test
model1 <- CPU~PREP+BRG+CLOG+DUMMY
stab.model1 <- Fstats(model1, data = D, from = 0.1)
plot(stab.model1, alpha = 0.01)
## the implied best breakpoint is
breakpoints(stab.model1)
## so re-fitting the model manually leads to
m <- lm(model1, data = D)
m1 <- lm(model1, data = D[1:136,])
m2 <- lm(model1, data = D[-(1:136),])
cbind(coef(m), coef(m1), coef(m2))
Based on the plot of the F statistics, a single break alternative seems to
be reasonable. My feeling would be that no more breakpoints are needed.
But if we want to check with breakpoints(), then we need the quick and
dirty solution:
## read and rescale data, add some small random fuzz
D <- read.csv2("D.csv")
n <- nrow(D)
fuzz <- 1e-5
set.seed(0)
D <- transform(D,
CPU = CPU/1e5,
PREP = PREP/1e4 + runif(n, -fuzz, fuzz),
BRG = BRG/1e3 + runif(n, -fuzz, fuzz),
CLOG = CLOG/1e3 + runif(n, -fuzz, fuzz),
DUMMY = DUMMY + runif(n, -fuzz, fuzz))
## estimate breakpoints
model1 <- CPU~PREP+BRG+CLOG+DUMMY
bp.model1 <- breakpoints(model1, data = D)
plot(bp.model1)
bp.model1
which also suggests a single breakpoint solution. However, the transition
may not be completely abrupt and one may want to look at rolling
regressions or something like that additionally.
hth,
Z
> I am really sorry for asking you, but I really don't know what to do.
>
> Warm thanks, have a nice week-end.
>
> Michel
>
>
>
> On Sat, Jul 30, 2011 at 1:28 AM, Achim Zeileis <Achim.Zeileis at uibk.ac.at>
> wrote:
> On Fri, 29 Jul 2011, Michel Lutz wrote:
>
> Achim,
>
> Thank you so much for this prompt answer. Really
> appreciated !
>
> Anyway, I am still a bit lost... don't you mind if I
> ask you somme
> additional questions?
>
> * *one standard approach is to employ a
> HACcovariance matrix* I did many researches but I
> never found this recommendation. The only paper I
> know is Cadsby, Stengos (1985), proposing a
> transformation to use F-test when AR(1) errors.
> However as I'm not a statistics expert, for sure I
> missed something important. Are you aware of any
> reference paper recommending this standard approach?
>
>
> The Bai & Perron (2003, JAE) paper for example. And it's also
> discussed in Andres (1993), iirc.
>
> ** about the use of the F-test (I won't use gefp, because
> I have not studied
> this method yet)*
> Based on the example, I used the below code:
> D <- data.frame(CPU=pred.cor2$CPU, PREP=PREP,
> BRG=BIZ$JOBPREPLOTRULE_BRG,
> CLOG=res.WIP, WE=DUMMY)
> model.mes <- CPU~PREP+BRG+CLOG+WE
>
> stab.model <- Fstats(model.mes, data = D, from = 0.1,
> vcov = function(x, ...) vcovHC(x, type = "HC",
> ...))
> plot(stab.model)
>
> Here enclosed my result.
> I am a bit scared because I am not knwoledgeable about
> F-Test with HAC (so
> far: I need to read), and I've never seen so high
> F-statistics results. Does
> this mean my model is poor?
>
>
> Note that (despite the name), the statistics are typically computed in
> Chi-squared form, i.e., not standardized by the number of parameters.
> For details see vignette("strucchange-intro").
>
> ** about the function breapoints*
> I installed strucchange 1.4-5.
> I used the below code:
> bp.mes <- breakpoints(model.mes, data = D)
>
> Unfortunately, the error occured again:
> Erreur dans chol2inv(qr.R(fm$qr)) :
> l'élément (5, 5) est nul, donc l'inverse ne peut être
> calculé
>
> Why such a chol2inv issue? No missing values in my data, I
> really don't know what to do.
>
>
> I guess that this is for the model with the dummy variable, right? And
> then I would guess that there are longer sequences where the dummy is
> only zero or only 1. This makes it impossible to estimate all
> coefficients on all of the subsets. The code tries to address this
> problem but with the given information it's hard to say where.
>
> * *But the tests need to be adjusted*
> Are such adjustements implement in breakpoints? (no
> mention in the "durab"
> example, basic function settings are used).
>
>
> breakpoints() is _no_ structural change test! It computes point
> estimates.
>
> However, if you compute confidence intervals, the same principles can
> be applied. See Bai & Perron (2003) for a discussion and
> help("RealInt") for a replication of their example.
>
>
>
> In advance, thank you very much, and sorry for the
> disturbance.
>
> Michel
>
>
>
> On Fri, Jul 29, 2011 at 10:58 AM, Achim Zeileis
> <Achim.Zeileis at uibk.ac.at>wrote:
>
> Michel:
>
>
> I am encountering a blocking issue when using
> the function 'breackpoints'
> from package 'strucchange'.
>
> *Context:*
>
> I use a data frame, 248
> observations of 5 variables, no
> NA.
> I compute a linear model, as
> y~x1+...+x4
> x4 is a dummy variable (0 or 1).
>
> I want to check this model for
> structural changes.
>
>
> If you want to _test_ for structural changes,
> then you should use a test,
> i.e., apply sctest() to an Fstats(), efp(), or
> gefp(). If your errors are
> correlated, one standard approach is to employ
> a HAC (heteroskedasticity and
> autocorrelation consistent) covariance matrix.
> There is a worked example
> with Fstats() using a HC matrix in
> example("durab"). An example with gefp()
> using a HAC matrix is in example("gefp"). See
> also the vignette("sandwich",
> package = "sandwich").
>
> The breakpoints() function is for _estimating_
> (aka dating) structural
> changes, not for testing.
>
> *Process & issues:*
>
>
> *First, I used function Fstats.*
> It works perfectly. However, this
> test is
> not adapted because regression
> residuals are not independant.
>
> That is why *I used
> 'breackpoints', which works for
> depedant errors* (Bai,
> 1997).
>
>
> Yes, as for coefficient estimates in a
> regression model, the breakpoint
> estimates are still consistent. But the tests
> need to be adjusted. Note also
> the in the presence of autocorrelation, the
> standard information criteria do
> not perform well (Bai & Perron 2003).
>
> Syntax:
>
> struc.test <-
> breakpoints(y~x1+x2+x3+x3+x4,
> data=D)
>
> *I get an error message:*
> Erreur dans chol2inv(qr.R(fm$qr))
> :
> l'?l?ment (5, 5) est nul, donc
> l'inverse ne peut ?tre calcul?
>
> (sorry for the french version, I
> don't know how to get the message
> english translation in R).
>
> My first assumption was this has
> *something to do with the dummy
> variable,
> so I skipped it*:
> struc.test <-
> breakpoints(y~x1+x2+x3+x3, data=D)
>
> *New error message:*
> Erreur dans if (max(abs((betar -
> fm$coefficients)/fm$**coefficients))
> <
> tol)
> check <- FALSE :
> valeur manquante l? o? TRUE /
> FALSE est requis
>
>
> I really can't understand what is
> going wrong. What 'tol' stands
> for?
> Seems
> it is not a 'breackpoints'
> attributes.
>
>
> The breakpoints() function needs to estimate
> the model on all possible
> subsets to determine the optimal breakpoints.
> This can be done via
> computation of recursive residuals and "tol"
> is an argument of the
> recresid() function. However, I recently
> enhanced the code trying to fix
> exactly this problem. Please try strucchange
> 1.4-5.
>
> Best,
> Z
>
> Any help would greatly appreciated.
>
> Many thanks in advance,
>
> Regards,
>
> Michel
>
> [[alternative HTML version
> deleted]]
>
>
>
>
>
>
More information about the R-help
mailing list