[R] Possible improvement in lm

Wed Jan 18 13:08:00 CET 2006

Folks,

I run a series of regressions (one for each quarter in the dataset)
and then extract the residuals from each stored lm object as follows:

vResiduals <- as.vector(unlist(resid(lQuarterlyRegressions[[i]])));

Here lQuarterlyRegressions is a list of objects returned by lm().

Next, I may go find outliers using identify() on a plot or do some
other analysis which tells me which row of the quarterly data I need
to take a closer look at.

However, if I try to match a point in one of the quarters with its
residual, I first have to drop the rows of my "current data" that have
NAs in either the explanatory variables or the explained variable, so
that the vector of residuals and the data have the same indexes.
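
To make the mismatch concrete, here is a minimal sketch on made-up
data. The data frame dQuarter and its columns y and x are hypothetical
stand-ins for one quarter of my data, and I am assuming the default
na.action (na.omit) is in effect.

```r
# Hypothetical toy data standing in for one quarter (not my real data).
dQuarter <- data.frame(y = c(1.0, 2.1, NA, 4.2, 5.1),
                       x = c(1, 2, 3, NA, 5))

# Rows 3 and 4 contain an NA and are dropped by lm() before fitting.
fit <- lm(y ~ x, data = dQuarter)
vResiduals <- as.vector(resid(fit))

length(vResiduals)  # 3 residuals ...
nrow(dQuarter)      # ... but 5 rows of data

# The workaround: drop the incomplete rows so that positions line up.
dComplete <- dQuarter[complete.cases(dQuarter[, c("y", "x")]), ]
stopifnot(nrow(dComplete) == length(vResiduals))
```

After this step, row k of dComplete corresponds to vResiduals[k], but
row k of the original dQuarter in general does not.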

This led to some serious confusion/bugs for me, and I am wondering if
it might not be better for lm to put an NA into those rows where the
point was dropped because of NAs in the explanatory or explained
variables (currently those rows are simply omitted from the residual
vector). Of course, there might be some arguments against this idea,
and I would be interested to hear them.
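
As a rough sketch of the result I have in mind, the residuals can be
padded by hand so that they line up with the original rows. This is
only an illustration on hypothetical toy data (dQuarter, y, x and
vPadded are made-up names), and it relies on the residuals carrying
the row names of the rows that were actually used in the fit.

```r
# Hypothetical toy data standing in for one quarter (not my real data).
dQuarter <- data.frame(y = c(1.0, 2.1, NA, 4.2, 5.1),
                       x = c(1, 2, 3, NA, 5))
fit <- lm(y ~ x, data = dQuarter)

# Start with one NA per original row, labelled by row name.
vPadded <- rep(NA_real_, nrow(dQuarter))
names(vPadded) <- rownames(dQuarter)

# Fill in the residuals for the rows that were used in the fit;
# resid(fit) is named by the row names of those rows.
vPadded[names(resid(fit))] <- resid(fit)

# vPadded now has the same length as the data, with NA in rows 3 and 4.
stopifnot(length(vPadded) == nrow(dQuarter))
```

With residuals in this shape, an index picked with identify() on the
full data would point at the matching residual directly.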

Thank you for your time and attention,

-- Vivek Satsangi
Student, Rochester, NY USA