[R] Backfitting with Missing Explanatory Values
Charlotte Maia
maiagx at gmail.com
Wed Nov 25 06:47:48 CET 2009
Hi, I just wanted to check that I'm not re-inventing the wheel here.
I'm developing a new algorithm for backfitting (i.e. fitting additive
models) and for computing partial residuals, where partial residuals
are still computed even where there are missing explanatory values.
Note that the additive models here contain both linear terms and
smooth terms.
If I am re-inventing the wheel, could someone please let me know? I'm
kind of on my own at the moment, and don't have quite as much academic
support as I would like.
Here's an excerpt from my incomplete package (on CRAN), amba.
One way to think of residuals is as a vector of values. If we start
with the response values and subtract the overall mean, we get values
with relatively high variance. If we then subtract the fitted values
for the first term, the variance decreases. If we repeat this for each
term, the variance gradually decreases until we are left with values
with relatively low variance. In the ideal case, the residuals would
have zero variance.
Under certain special conditions, it is possible to subtract a fitted
value only where the corresponding explanatory value is valid (i.e.
not missing). Where it is not valid, we just skip that subtraction
(i.e. for that particular observation, the variance is not reduced as
much). For this to work, each explanatory variable's partial residuals
for each fit (not just the final fit) must be zero-centered. For
smoothers this isn't a big issue; however, conventional linear terms
often do not satisfy this zero-centered condition. Note that the
centering condition applies to partial residuals in relation to an
explanatory variable (not in relation to a parameter), and each
explanatory variable may have multiple parameters associated with it.
For our linear terms to satisfy it, we require extra parameters:
categorical terms require one parameter for each level, and polynomial
terms require their own intercepts.
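To make the idea above concrete, here is a rough Python sketch of
backfitting where a term's fitted value is simply not subtracted when
its explanatory value is missing. It is only an illustration under
stated assumptions and is not the amba code: missing values are encoded
as None, the linear term gets its own intercept, the categorical term
gets one parameter (a mean) per level, and the names fit_linear,
fit_levels and backfit, as well as the data, are made up for the
example.

```python
from statistics import mean


def fit_linear(x, r):
    """Least-squares line with its own intercept, fitted only where x is present."""
    rows = [i for i, xi in enumerate(x) if xi is not None]
    x_bar = mean(x[i] for i in rows)
    r_bar = mean(r[i] for i in rows)
    # Assumes at least two distinct non-missing x values, so sxx > 0.
    sxx = sum((x[i] - x_bar) ** 2 for i in rows)
    slope = sum((x[i] - x_bar) * (r[i] - r_bar) for i in rows) / sxx
    intercept = r_bar - slope * x_bar
    # A missing x contributes 0.0, i.e. the subtraction is skipped for that row.
    return [0.0 if xi is None else intercept + slope * xi for xi in x]


def fit_levels(g, r):
    """Categorical term with one parameter (the mean of r) per level."""
    by_level = {}
    for gi, ri in zip(g, r):
        if gi is not None:
            by_level.setdefault(gi, []).append(ri)
    level_mean = {lev: mean(vals) for lev, vals in by_level.items()}
    return [0.0 if gi is None else level_mean[gi] for gi in g]


def backfit(y, terms, n_iter=20):
    """terms is a list of (fit_function, explanatory_values) pairs."""
    n = len(y)
    y_bar = mean(y)
    fitted = [[0.0] * n for _ in terms]
    for _ in range(n_iter):
        for j, (fit, x) in enumerate(terms):
            # Partial residuals for term j: response minus the overall mean
            # minus every other term's current fitted values.
            partial = [
                y[i] - y_bar - sum(fitted[k][i] for k in range(len(terms)) if k != j)
                for i in range(n)
            ]
            fitted[j] = fit(x, partial)
    residuals = [y[i] - y_bar - sum(f[i] for f in fitted) for i in range(n)]
    return y_bar, fitted, residuals


# Made-up data: x and g each have missing entries (None).
x = [1.0, 2.0, None, 4.0, 5.0, 6.0, None, 8.0]
g = ["a", "b", "a", None, "b", "a", "b", "a"]
y = [3.1, 5.4, 4.0, 9.2, 11.5, 13.0, 8.0, 17.2]

y_bar, fitted, residuals = backfit(y, [(fit_linear, x), (fit_levels, g)])
```

Every observation keeps a residual, including the ones with a missing
x or g. Because each term carries its own intercept or level means,
what is left after fitting a term is zero-centered over the rows where
that term's explanatory value is present, which is the condition
described in the excerpt.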
kind regards
Charlotte