[R] Possible improvement in lm
ggrothendieck at gmail.com
Wed Jan 18 14:33:27 CET 2006
1. Try using
lm(...whatever..., na.action = na.exclude)
so that resid() and fitted() return NA in the positions of the rows
that were dropped.
2. Be sure to read the note on Using Time Series in ?lm
3. The dyn package will accept ts, irts, its and zoo class time series
and output time series for the residuals. Just preface lm with dyn$.
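library(dyn)
# test data: a quarterly series with one missing value
x <- ts(1:10, start = 2000, frequency = 4)
x[3] <- NA  # position of the NA is arbitrary, for illustration only
y <- x + rnorm(10)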
# regress series y against series x
y.lm <- dyn$lm(y ~ x)
resid(y.lm) # note that residuals are a time series
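
For point 1, here is a minimal base R sketch of what na.exclude changes. The data frame d and its values are made up for illustration, and na.omit is passed explicitly only to make the comparison visible (it is normally the default via options("na.action")).

```r
# made-up data: row 3 has a missing explanatory value
d <- data.frame(x = c(1, 2, NA, 4, 5, 6),
                y = c(1.1, 2.3, 2.9, 4.2, 4.8, 6.1))

# na.omit: the incomplete row is dropped and the residuals get shorter
fit.omit <- lm(y ~ x, data = d, na.action = na.omit)
length(resid(fit.omit))     # 5

# na.exclude: the row is still left out of the fit, but resid() pads
# the result with NA so it lines up with the rows of d
fit.excl <- lm(y ~ x, data = d, na.action = na.exclude)
length(resid(fit.excl))     # 6
which(is.na(resid(fit.excl)))  # position of the dropped row

# residuals can now be bound back onto the original data directly
d$res <- resid(fit.excl)
```

With the padded residuals, a point picked out with identify() on a plot has the same index in the data and in the residual vector.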
On 1/18/06, Vivek Satsangi <vivek.satsangi at gmail.com> wrote:
> I do a series of regressions (one for each quarter in the dataset) and
> then go and extract the residuals from each stored lm object that is
> returned as follows:
> vResiduals <- as.vector(unlist(resid(lQuarterlyRegressions[[i]])));
> Here lQuarterlyRegressions is a vector of objects returned by lm().
> Next, I may go find outliers using identify() on a plot or do some
> other analysis which tells me which row of the quarterly data I need
> to take a closer look at.
> However, if I try to match some point in one of the quarters that I
> have with its residual, then I have to drop the points from my
> "current Data" which have NA's for either the explanatory variables or
> the explained, so that the vector of residuals and the data have the
> same indexes.
> This led to some serious confusion/bugs for me, and I am wondering if
> it might not be better for lm to put an NA into those rows where the
> point was dropped because of NA's in the explanatory or explained
> variables (currently it just returns nothing at that index). Of course,
> there might be some arguments against this idea, and I would be
> interested to hear them.
> Thank you for your time and attention,
> -- Vivek Satsangi
> Student, Rochester, NY USA