[R] Puzzled at lm() and time-series

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Aug 23 08:53:11 CEST 2004


On Sat, 21 Aug 2004, Ajay Shah wrote:

> I tried toy problems and there doesn't seem to be a basic problem
> between lm() and ts objects:
> 
>    X = data.frame(x=c(1,2,7,9), y=c(7,2,3,1))
>    lm(y ~ x, X)
>    X <- lapply(X, function(x) ts(x, frequency=12, start=c(1994,7)))
>    lm(y ~ x, X)
> 
> and this works fine - whether you do an lm() before or after making ts
> objects, it's okay. 
> 
> But I have a situation where things aren't okay. I have two happy
> time-series objects in a data frame:
> 
> > M$g.cpi.iw
>        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
> 1994                                     11.07 10.94 11.20 10.31  9.81  9.47
> 1995  9.89  9.81  9.74  9.67 10.29 10.47 11.39 10.92 10.07 10.38 10.31  9.69
> ... (deleted)
> 2004  4.35  4.13  3.49  2.23  2.83  3.02    NA                              
> 
> > M$g.wpi
>        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
> 1994                                     11.14 11.83 11.88 12.66 13.20 14.54
> 1995 16.18 16.88 16.88 11.02 10.94  9.70  9.61  8.94  8.92  8.47  8.24  6.62
> ... (deleted)
> 2004  6.45  6.14  4.79  4.52  4.99  6.14  6.86                              
> 
> But I can't get an OLS going:
> 
> > lm(g.cpi.iw ~ g.wpi, M)
> Error in "storage.mode<-"(`*tmp*`, value = "double") : 
>         invalid time series parameters specified
> 
> Any idea why? I think both objects are quite conformable (except for
> an NA, but that should get dropped by lm() by default).

That's the problem: the row with an NA gets dropped but the tsp attribute
does not get adjusted.
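
A minimal sketch of that mismatch, assuming a hypothetical four-month
series (the values and the names x, v, old_tsp and msg are made up for
illustration; this is not the internal lm() code path). It drops the NA by
plain subsetting and then tries to put the original tsp back on the
shorter vector, which R should reject because the stored start, end and
frequency no longer match the length:

```r
# Hypothetical monthly series: four observations, the last one missing.
x <- ts(c(11.07, 10.94, 11.20, NA), frequency = 12, start = c(1994, 7))
old_tsp <- tsp(x)                 # start, end, frequency for 4 observations

# Drop the NA by plain subsetting; the result is a bare numeric vector.
keep <- !is.na(as.numeric(x))
v <- as.numeric(x)[keep]          # length 3, no ts attributes

# Re-attaching the old tsp describes 4 time points for 3 values,
# so the assignment is expected to fail; capture the message.
msg <- tryCatch({
  tsp(v) <- old_tsp
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```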

BTW, try traceback() when you get an error.

lm(g.cpi.iw ~ g.wpi, data = na.omit(M))

will work, since an explicit call to na.omit does the right thing re 
attributes.  The difference is in the internal code which says

	/* need to transfer _all but dim_ attributes, possibly lost
	   by subsetting in na.action.  */
	for ( i = length(ans) ; i-- ; )
	  	copyMostAttrib(VECTOR_ELT(data, i),VECTOR_ELT(ans, i));

That's wrong in this case.
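
For comparison, a small sketch of an explicit na.omit() call keeping the
attributes consistent. It assumes a single hypothetical ts with a trailing
NA rather than the data frame M from the post (whose construction is not
shown), so it illustrates the idea only: the ts method of na.omit() trims
the missing end point and rewrites tsp to match the shorter series.

```r
# Hypothetical monthly series with a trailing NA (values are made up).
x <- ts(c(11.07, 10.94, 11.20, NA), frequency = 12, start = c(1994, 7))

# The ts method of na.omit() drops leading/trailing NAs and adjusts tsp.
y <- na.omit(x)

print(tsp(x))     # describes 4 monthly observations
print(tsp(y))     # describes 3 monthly observations
print(length(y))
```

Note that the ts method refuses series with NAs in the interior, so this
only covers missing values at the ends.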

I think it was a leap to assume that you could fit linear models to time 
series via lm.  ts objects are not mentioned on the help page for lm, are 
they?  Another trap is to assume that diff() will be respected by lm.
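
One reading of the diff() trap, sketched under assumptions: the series
below are simulated and the names x, y, dx, aligned, d and fit are
hypothetical. diff() returns a series that is one observation shorter and
starts one period later, and lm() does not align variables by time. A
possible workaround is to align the series explicitly with ts.intersect()
and hand lm() plain numeric columns:

```r
# Simulated monthly series, invented for illustration only.
set.seed(1)
x <- ts(cumsum(rnorm(12)), frequency = 12, start = c(1994, 7))
y <- ts(rnorm(12),         frequency = 12, start = c(1994, 7))

dx <- diff(x)
print(length(dx))   # one observation fewer than y
print(start(dx))    # starts one month after y

# Align both series on their common time window.
aligned <- ts.intersect(y = y, dx = dx)

# Strip the ts attributes so lm() only sees ordinary numeric vectors.
d <- data.frame(y  = as.numeric(aligned[, "y"]),
                dx = as.numeric(aligned[, "dx"]))
fit <- lm(y ~ dx, data = d)
print(coef(fit))
```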

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



