[R] predict.loess and NA/NaN values

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Aug 30 14:50:03 CEST 2010


The underlying problem is your expectations.

R (unlike S) was set up many years ago to use na.omit as the default,
and when fitting, both lm() and loess() silently omit cases with
missing values.  So why should prediction from 'newdata' be different
unless documented to be so (which it is nowadays for predict.lm,
even though you are adding to the evidence that that was a mistake)?
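
To make the fitting behaviour concrete, here is a minimal sketch (not
part of the original exchange).  The data frame 'd' and the positions
of the missing values are made up for illustration, and the sketch
assumes the session still has the default na.action option.

```r
# Illustrative data: 20 cases, two of which get a missing predictor.
set.seed(1)
d <- data.frame(x = seq(-2, 2, length.out = 20))
d$y <- d$x + rnorm(20, sd = 0.3)
d$x[c(3, 11)] <- NA

# In a default session this option is "na.omit".
getOption("na.action")

# Neither call warns about the two incomplete cases.
fit.lm <- lm(y ~ x, data = d)
fit.lo <- loess(y ~ x, data = d)

# Both fits are based on the complete cases only.
n.lm <- length(residuals(fit.lm))
n.lo <- fit.lo$n
c(lm = n.lm, loess = n.lo)
```

Both counts should be two short of nrow(d), with no message from
either fitting function.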

loess() is somewhat different from lm() in that it does not in general 
allow extrapolation, and the prediction for Inf and NaN is simply 
undefined.
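
The difference in extrapolation can be seen in a small sketch (again
not from the original thread; the data are invented).  It assumes the
default surface = "interpolate" for the first loess fit, and uses
loess.control(surface = "direct") for a second fit as one documented
way to get values outside the range of the data.

```r
# Illustrative data on [0, 1].
set.seed(2)
x <- seq(0, 1, length.out = 25)
y <- sin(2 * pi * x) + rnorm(25, sd = 0.1)

fit.lm  <- lm(y ~ x)
fit.lo  <- loess(y ~ x)   # default: surface = "interpolate"
fit.dir <- loess(y ~ x, control = loess.control(surface = "direct"))

# One point inside the range of x, one well outside it.
new <- data.frame(x = c(0.5, 2))

p.lm  <- unname(predict(fit.lm,  new))  # lm extrapolates the line
p.lo  <- unname(predict(fit.lo,  new))  # NA outside the data range
p.dir <- unname(predict(fit.dir, new))  # direct fit does extrapolate
```

Whether an extrapolated value from the direct fit means anything is a
separate question; the sketch only shows that it is not NA.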

Nevertheless, take a look at the version in R-devel (pre-2.12.0), which
gives you more options.
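
As a hedged illustration: current releases of R document an na.action
argument for predict.loess, with na.pass as its default, and I am
assuming here that this is among the options meant above.  The helper
predict_padded() below is hypothetical (it is not part of R) and shows
one way to code around the problem in versions without that argument,
for a model with a single predictor named x.

```r
# Illustrative fit with a single predictor called x.
x <- seq(-2, 2, length.out = 30)
y <- x + sin(3 * x)
fit <- loess(y ~ x)

newx <- c(0.5, Inf, -Inf, NA, NaN)

# Hypothetical helper: predict only at finite values and pad the
# rest with NA, so the result always has length(newx) elements.
predict_padded <- function(fit, newx) {
  out <- rep(NA_real_, length(newx))
  ok <- is.finite(newx)
  if (any(ok)) out[ok] <- predict(fit, data.frame(x = newx[ok]))
  out
}

p.pad <- predict_padded(fit, newx)

# In versions that have the argument, this call asks for the same
# padding directly (na.pass is the documented default there).
p.pass <- unname(predict(fit, data.frame(x = newx), na.action = na.pass))
```

The helper has the advantage of not depending on the R version, at the
cost of hard-coding the predictor name.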

On Fri, 27 Aug 2010, Philipp Pagel wrote:

>
> 	Hi!
>
> In a current project, I am fitting loess models to subsets of data in
> order to use the loess predictions for normalization (similar to what
> is done in many microarray analyses). While working on this I ran into
> a problem when I tried to predict from the loess models and the data
> contained NAs or NaNs. I tracked down the problem to the fact that
> predict.loess will not return a value at all when fed with such
> values. A toy example:
>
> x <- rnorm(15)
> y <- x + rnorm(15)
> model.lm <- lm(y~x)
> model.loess <- loess(y~x)
> predict(model.lm, data.frame(x=c(0.5, Inf, -Inf, NA, NaN)))
> predict(model.loess, data.frame(x=c(0.5, Inf, -Inf, NA, NaN)))
>
> The behaviour of predict.lm meets my expectation: I get a vector of
> length 5 where the unpredictable ones are NA or NaN. predict.loess on the
> other hand returns only 3 values quietly skipping the last two.
>
> I was unable to find anything in the manual page that explains this
> behaviour or says how to change it. So I'm asking the community: Is
> there a way to fix this or do I have to code around it?
>
> This is in R 2.11.1 (Linux), by the way.
>
> Thanks in advance
>
> 	Philipp
>
>
> --
> Dr. Philipp Pagel
> Lehrstuhl für Genomorientierte Bioinformatik
> Technische Universität München
> Wissenschaftszentrum Weihenstephan
> Maximus-von-Imhof-Forum 3
> 85354 Freising, Germany
> http://webclu.bio.wzw.tum.de/~pagel/
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

