[R] Non-normal residuals.

tomreilly tomreilly at autobox.com
Thu Nov 12 18:02:33 CET 2009


Kevin,

Kudos to you for asking a question that most do not....

I have attached an analysis of your residuals for "10 inch" called
10inchres.zip. I have also attached our analysis as "10inches.zip". I have
posted some reports for you and added some commentary to help you understand
this all fully.

The conclusion is that your model/methodology is not capturing the pattern in
the data properly. Worse yet, it is actually creating or "injecting"
structure into the errors. In turn, any forecast that comes out of such a
model/approach will be doomed.

I have copied the ACF/PACF from the enclosed report "details.htm" here. It
shows that there is a "blip" at lag 3 (ACF of -.383, t-ratio of -2.48). This
may be evidence of something wrong: either a model that is overzealous or a
model that has not captured the structure.  Most people aren't aware that bad
modeling can create issues.


  Analysis for Variable               Y    10inplates-RESIDUALS                

      LAG   ACF    STND.     T-    CHI-SQUARE &    PACF    STND.     T-         
           VALUE   ERROR   RATIO   PROBABILITY     VALUE   ERROR   RATIO        
                                                                                
        1    .037    .154     .24       .1  .8059    .037    .154     .24       
        2   -.022    .155    -.14       .1  .9597   -.023    .154    -.15       
        3   -.383    .155   -2.48      7.0  .0711   -.382    .154   -2.47       
        4   -.174    .176    -.99      8.5  .0750   -.175    .154   -1.13       
        5    .148    .180     .82      9.6  .0877    .164    .154    1.06       
        6   -.001    .183    -.01      9.6  .1429   -.179    .154   -1.16       
        7   -.006    .183    -.03      9.6  .2128   -.176    .154   -1.14       
        8   -.009    .183    -.05      9.6  .2944    .113    .154     .73       
        9   -.011    .183    -.06      9.6  .3834   -.025    .154    -.16       
       10   -.035    .183    -.19      9.7  .4694   -.222    .154   -1.44       
       11   -.053    .183    -.29      9.8  .5448   -.021    .154    -.13       
       12    .036    .183     .20      9.9  .6229    .118    .154     .76       
       13    .013    .183     .07      9.9  .6995   -.157    .154   -1.02       
       14    .080    .183     .43     10.3  .7362   -.017    .154    -.11       
       15   -.132    .184    -.72     11.5  .7132   -.050    .154    -.33       
       16   -.109    .186    -.59     12.4  .7165   -.192    .154   -1.25       
       17   -.029    .188    -.16     12.5  .7717   -.073    .154    -.47       
       18   -.018    .188    -.09     12.5  .8214   -.084    .154    -.55       
       19    .157    .188     .84     14.5  .7556   -.027    .154    -.18       
       20    .040    .191     .21     14.6  .7984   -.017    .154    -.11       
       21    .030    .191     .16     14.7  .8384   -.032    .154    -.21       
       22   -.005    .192    -.03     14.7  .8753   -.018    .154    -.12       
       23    .008    .192     .04     14.7  .9053    .082    .154     .53       
       24    .046    .192     .24     14.9  .9232    .039    .154     .25       
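
For anyone who wants to produce a similar table in R, here is a rough sketch.
It assumes the residuals sit in a numeric vector called res (a made-up name).
The series below is simulated, not the attached data, and n = 42 is used only
because 1/sqrt(42) is about .154, the lag-1 standard error in the table.
Box.test is R's portmanteau test and may not match the CHI-SQUARE column
exactly.

```r
# Stand-in residual series. This is simulated, NOT the attached data:
# an AR process with a negative coefficient at lag 3, length 42.
set.seed(1)
res <- as.numeric(arima.sim(model = list(ar = c(0, 0, -0.4)), n = 42))

# Sample ACF and PACF out to lag 24, without drawing the plots.
a <- acf(res, lag.max = 24, plot = FALSE)
p <- pacf(res, lag.max = 24, plot = FALSE)

# Ljung-Box portmanteau p-value for each cumulative lag 1..24.
lb_p <- sapply(1:24, function(k) {
  Box.test(res, lag = k, type = "Ljung-Box")$p.value
})

diag_tab <- data.frame(
  lag  = 1:24,
  acf  = as.numeric(a$acf)[-1],  # drop lag 0, which is always 1
  pacf = as.numeric(p$acf),
  lb_p = lb_p
)

# Rough white-noise band: flag lags where |ACF| exceeds 2/sqrt(n).
diag_tab$flag <- abs(diag_tab$acf) > 2 / sqrt(length(res))
print(round(diag_tab[1:6, 1:4], 3))

# Separate question: do the residuals look normal? (Shapiro-Wilk test)
sw <- shapiro.test(res)
print(sw$p.value)
```

A flagged lag says the residuals are not white noise; it does not say which
kind of missing structure (ARIMA terms or interventions) is responsible.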

If you refer to stat.htm in the zip file you will see the model I pasted
here.  You will see that there are two "Seasonal Pulse" interventions
identified, starting 12/2007 and 1/2008.  This indicates that this seasonal
effect is being missed in your model. Also, note the two "level shift"
interventions identified at (or around) 5/08 and 4/09, indicating runs of
residuals that are clustered on one side of zero (mostly positive or mostly
negative).  There is also an Autoregressive factor with a lag of 3 (see the
Box-Jenkins textbook for more on ARIMA modeling).  There are a few one-time
or "pulse" interventions which reflect unusually large or small values
(e.g. 3/09) that are not being adjusted for.


FORECASTING WITH FINAL MODEL  
                                                              
                                                                                
          MODEL COMPONENT       LAG    COEFF     STANDARD      P       T        
  #                            (BOP)              ERROR      VALUE   VALUE      
                                                                                
    1CONSTANT                          .154       .804E-01   .0653     1.91
    2Autoregressive-Factor #  1    3  -.711       .141       .0000    -5.04
                                                                                
  INPUT SERIES X1  I~P00035 2009/  3    PULSE                                   
                                                                                
    3Omega (input) -Factor #  2    0   3.24       .320       .0000    10.13
                                                                                
  INPUT SERIES X2  I~S00021 2008/  1    SEASP                                   
                                                                                
    4Omega (input) -Factor #  3    0   3.36       .353       .0000     9.53
                                                                                
  INPUT SERIES X3  I~L00036 2009/  4    LEVEL                                   
                                                                                
    5Omega (input) -Factor #  4    0  -.888       .159       .0000    -5.58
                                                                                
  INPUT SERIES X4  I~L00025 2008/  5    LEVEL                                   
                                                                                
    6Omega (input) -Factor #  5    0   .287       .110       .0143     2.60
                                                                                
  INPUT SERIES X5  I~P00036 2009/  4    PULSE                                   
                                                                                
    7Omega (input) -Factor #  6    0  -2.71       .373       .0000    -7.27
                                                                                
  INPUT SERIES X6  I~P00031 2008/ 11    PULSE                                   
                                                                                
    8Omega (input) -Factor #  7    0  -1.44       .338       .0002    -4.26
                                                                                
  INPUT SERIES X7  I~S00020 2007/ 12    SEASP                                   
                                                                                
    9Omega (input) -Factor #  8    0  -1.21       .224       .0000    -5.40
                                                                                
  INPUT SERIES X8  I~P00037 2009/  5    PULSE                                   
                                                                                
   10Omega (input) -Factor #  9    0  -.838       .334       .0177    -2.51
                                                                                
  INPUT SERIES X9  I~P00021 2008/  1    PULSE                                   
                                                                                
   11Omega (input) -Factor # 10    0  -2.18       .452       .0000    -4.83
                                                                                
  INPUT SERIES X 10 I~P00025 2008/  5    PULSE                                  
                                                                                
   12Omega (input) -Factor # 11    0   .648       .313       .0470     2.07
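
To make the intervention types in the table above concrete, here is a rough
sketch of how a pulse, a level shift and a seasonal pulse can be coded as
dummy regressors in R and passed to arima() through xreg, together with an
AR term at lag 3 only. This is not how Autobox finds the interventions; the
dates are supplied by hand. The data are simulated placeholders, the helper
names (pulse, level_shift, seasonal_pulse) are made up, and the start of
2006/5 with 42 observations is inferred from the labels in the table
(observation 35 is 2009/3), so treat it as an assumption.

```r
set.seed(42)
n     <- 42
t_idx <- seq_len(n)

# Hypothetical helpers: 0/1 dummies indexed by observation number.
pulse          <- function(at) as.numeric(t_idx == at)   # one-time spike
level_shift    <- function(at) as.numeric(t_idx >= at)   # permanent step
seasonal_pulse <- function(at, period = 12) {            # recurs each year
  as.numeric(t_idx >= at & (t_idx - at) %% period == 0)
}

xreg <- cbind(
  pulse_2009_03 = pulse(35),
  level_2008_05 = level_shift(25),
  seasp_2007_12 = seasonal_pulse(20)
)

# Placeholder series: white noise plus arbitrary intervention effects.
y <- ts(rnorm(n) + 3 * xreg[, "pulse_2009_03"] +
          1 * xreg[, "level_2008_05"] - 1.5 * xreg[, "seasp_2007_12"],
        start = c(2006, 5), frequency = 12)

# AR(3) with lags 1 and 2 fixed at zero, so only lag 3 is estimated.
# Parameter order in `fixed`: ar1, ar2, ar3, intercept, then xreg columns.
fit <- arima(y, order = c(3, 0, 0), xreg = xreg,
             fixed = c(0, 0, NA, NA, NA, NA, NA),
             transform.pars = FALSE, method = "ML")
print(fit)
```

The residuals of such a fit can then be checked again for leftover
autocorrelation before any forecast is trusted.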




Here is our model for 10 inch plates using the historical data.  Autobox
identified a model with autoregressive factors at lag 1 and at lag 12 (the
seasonal term).  Note that seasonal pulses again appear in the model, here
starting 11/2007 and 12/2007, along with two pulse interventions (3/09 and
1/09).

                                                                                
          MODEL COMPONENT       LAG    COEFF     STANDARD      P       T        
  #                            (BOP)              ERROR      VALUE   VALUE      
                                                                                
    1CONSTANT                          119.       72.9       .1113     1.63
    2Autoregressive-Factor #  1    1   .941       .557E-01   .0000    16.90
    3Autoregressive-Factor #  2   12  -.738       .220       .0019    -3.35
                                                                                
  INPUT SERIES X1  I~P00035 2009/  3    PULSE                                   
                                                                                
    4Omega (input) -Factor #  3    0   .110E+04   109.       .0000    10.12
                                                                                
  INPUT SERIES X2  I~S00020 2007/ 12    SEASP                                   
                                                                                
    5Omega (input) -Factor #  4    0  -645.       71.6       .0000    -9.01
                                                                                
  INPUT SERIES X3  I~S00019 2007/ 11    SEASP                                   
                                                                                
    6Omega (input) -Factor #  5    0  -342.       64.4       .0000    -5.31
                                                                                
  INPUT SERIES X4  I~P00033 2009/  1    PULSE                                   
                                                                                
    7Omega (input) -Factor #  6    0   297.       122.       .0197     2.44
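
For readers who want to try a model of this general shape in R, here is a
rough sketch: AR terms at lag 1 and lag 12 plus November and December
seasonal pulses, with the pulses carried forward into the forecast period
through newxreg. The data and effect sizes are simulated placeholders, the
helper seasp is a made-up name, the 2006/5 start is inferred from the
observation labels above, and the estimates R reports need not follow the
same sign convention as the table.

```r
set.seed(7)
n <- 42    # assumed history length
h <- 12    # forecast horizon in months
t_all <- seq_len(n + h)

# Hypothetical helper: 1 at observation `at` and every 12 months after it.
seasp <- function(at, period = 12) {
  as.numeric(t_all >= at & (t_all - at) %% period == 0)
}

# Observation 19 is 2007/11 and observation 20 is 2007/12.
X <- cbind(seasp_nov = seasp(19), seasp_dec = seasp(20))

# Placeholder series with arbitrary November and December drops.
y <- ts(1000 + rnorm(n, sd = 50) -
          300 * X[1:n, "seasp_nov"] - 600 * X[1:n, "seasp_dec"],
        start = c(2006, 5), frequency = 12)

# AR(1) times seasonal AR(1) at period 12, with the pulses as regressors.
fit <- arima(y, order = c(1, 0, 0),
             seasonal = list(order = c(1, 0, 0), period = 12),
             xreg = X[1:n, ], method = "ML")

# The future rows of X tell the forecast where the pulses recur.
fc <- predict(fit, n.ahead = h, newxreg = X[(n + 1):(n + h), ])
print(fc$pred)
```

If the future rows of the seasonal pulse columns were left at zero, the
November and December effects would drop out of the forecast.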

With all of this said, you have some very difficult time series.  Using
simple and free methods may not give you what you are looking for.  Autobox
is completely automatic like R, but has the ability to recognize and adjust
for 4 types of interventions.  If you don’t adjust the model for these
interventions then the "fit" would be off as we have seen with this case
study.

Contact me or go to our website to learn more about us.

Tom Reilly
Vice President of Sales
Automatic Forecasting Systems
215-675-0652
http://www.autobox.com
tomreilly at autobox.com
skype:tomreilly at autobox.com

Here is Kevin's original post......

This is kind of a general question about methodology more than anything. But
I was looking for some advice. I have fit a time-series model and feel
pretty confident that I have taken this model (exponential smoothing) as far
as it will go. In other words, looking at the data and the fitted curves I
think it is as close as I can get. But when I plot the residuals and form a
qqplot it seems that the residuals are not "normal". From the QQ-plot there
is some factor that is influencing the series that cannot be attributed to
"normal random" fluctuation. I can run 'tsdiag' to determine basically
whether the residuals are normal and random, but what if they are not? What
would be the next set of 'R' commands that I might run to find this
influence? 
Any suggestions? 
Kevin 




rkevinburton wrote:
> 
> Hello,
> 
> I asked a question about what the most likely process to follow is if,
> after a time-series fit is performed, the residuals are found to be
> non-normal. One person responded and offered to help if I supplied a
> sample data set. Unfortunately, now that I have a sample, I have lost the
> email address. If you are that person or have some ideas please email me
> back at rkevinburton at charter.net.
> 
> Thank you.
> 
> Kevin
> 
http://old.nabble.com/file/p26322376/10inches.zip 10inches.zip 
http://old.nabble.com/file/p26322376/10inchres.zip 10inchres.zip 